Updates

Saturday, 17 August 2013

String in java - A closer look

When I wrote about concatenation of strings in java, I was surprised to find that this post was still sitting in my drafts, so I am adding these lines at the start of the post.
String is one of the most widely used java classes. It is special in java because it has some characteristics that a usual java class does not have.
Though you probably know most of these things already, this will help you recall them, and I am sure you will find one or two new things. I recommend that you take your time and read through this completely, as java String is a basic building block of your java programs.

Immutable Java String

Java String is an immutable object. For an immutable object you cannot modify any of its attributes' values. Once you have created a java String object, it cannot be changed into some other object or a different String. A reference to a java String instance, however, is mutable: it can be pointed at another String. There are multiple ways to make an object immutable. A simple and straightforward way is to make all the attributes of the class final. Java String has all attributes marked as final except the hash field, which only caches the hash code.
The String class itself is also final, so it cannot be subclassed. I have not been able to nail down the exact reason behind this (maybe in a later post). My guess is that the implementors of String didn't want anybody else to mess with the String type and wanted one de facto definition for all the behaviours of String.
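To make the point about immutability concrete, here is a minimal sketch. The class name ImmutableStringDemo and the helper shout() are made up for this illustration; the only String API used is toUpperCase(Locale). It shows that a "modifying" method hands back a new String and that only the reference can be reassigned.

```java
import java.util.Locale;

class ImmutableStringDemo {

  // Hypothetical helper: returns an upper-cased copy; the String passed in is left untouched.
  static String shout(String s) {
    return s.toUpperCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    String original = "java";
    String shouted = shout(original);

    System.out.println(original); // java
    System.out.println(shouted);  // JAVA

    // Only the reference changes here; the old "java" object itself is never modified.
    original = original + " string";
    System.out.println(original); // java string
  }
}
```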

Java String Instantiation

Continuing the discussion of immutability, let us see how that property is used when instantiating a String instance. The JVM maintains a memory pool for Strings. When you create a String from a literal, this pool is checked first. If an equal instance already exists, the new reference is mapped to that existing instance. If not, a new java String instance is created in the pool.
This approach of creating a java String instance is in sync with the immutable property: sharing one instance is safe only because nobody can change it. When you use 'new' to instantiate a String, you force the JVM to store the new instance in a fresh memory location, bypassing the pool lookup. Inside the String class, what you have is an array that holds the characters of the String you create (a char array in older JDKs, a byte array since Java 9).
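The pool behaviour described above can be observed with a small sketch. The class name StringPoolDemo and the helper sameObject() are made up for this illustration; the == comparison is used on purpose here, only to look at object identity.

```java
class StringPoolDemo {

  // Hypothetical helper: true only when both references point at the same object.
  static boolean sameObject(String a, String b) {
    return a == b;
  }

  public static void main(String[] args) {
    String literal1 = "javaphobia";
    String literal2 = "javaphobia";              // reuses the pooled instance
    String created = new String("javaphobia");   // forces a fresh object

    System.out.println(sameObject(literal1, literal2)); // true
    System.out.println(sameObject(literal1, created));  // false
    System.out.println(literal1.equals(created));       // true
  }
}
```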

Following are some of the ways to instantiate a java String

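String str1 = "javaphobia";                                  // literal, taken from the pool
String str2 = new String("Hello");                           // copy of another String
String str3 = new String(new char[] {'J', 'a', 'v', 'a'});   // from a char array
String str4 = new String(new byte[] {74, 97, 118, 97});      // from bytes, decoded with the default charset
String str5 = new String(new StringBuffer("Hello"));         // from a StringBuffer
String str6 = new String(new StringBuilder("Hello"));        // from a StringBuilder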
String also has a no-argument constructor. It looks odd: java String is immutable, and yet there is a constructor which does nothing but create an empty String. I don't see much use for this constructor, because after you create a String you cannot modify it.

Java String Comparison


Do not use the == operator to compare java Strings. It compares only the object references and not their contents. If the references are the same, then the values must be the same, but the reverse does not hold: even if the references are not the same, the contents can be the same. Therefore the == operator does not guarantee a correct equality check for java Strings. Consider,
String[] strArray1 = { new String("come"), new String("came") };
"come" == strArray1[0]; gives FALSE.
  • You might argue that using the == operator together with intern() gives the right result. It does, but only if every String involved has been interned, so it is not something to rely on.
  • For equality comparison of java Strings, the simplest and easiest way is to go with the equals() method.
"come".equals(strArray1[0]); gives TRUE.
The equals() method is part of the Object class. The java String class overrides it and provides an implementation for equality comparison. String's equals() implementation uses a three-step process to compare two java Strings:
  1. Compare references (if both String references are the same return true, else continue)
  2. Compare lengths (if the two String lengths differ return false, else continue)
  3. Compare character by character sequentially.
Shall we always use the equals() method for equality comparison in any type of object? No. You need to check the implementation in the respective class. StringBuffer and StringBuilder do not override equals(), so they fall back to the reference comparison inherited from Object.
Use equalsIgnoreCase(String) to compare Strings irrespective of case (case insensitivity).
"javaString".equalsIgnoreCase("JAVASTRING"); returns TRUE.
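The comparisons discussed in this section can be put together in one small sketch. The class name StringComparisonDemo and the helper sameContent() are made up for this illustration; contentEquals() is one of several ways to compare a StringBuilder by content.

```java
class StringComparisonDemo {

  // Hypothetical helper: compares two StringBuilders by content, since
  // StringBuilder.equals() only compares references.
  static boolean sameContent(StringBuilder a, StringBuilder b) {
    return a.toString().contentEquals(b);
  }

  public static void main(String[] args) {
    String[] words = { new String("come"), new String("came") };

    System.out.println("come" == words[0]);                          // false
    System.out.println("come".equals(words[0]));                     // true
    System.out.println("javaString".equalsIgnoreCase("JAVASTRING")); // true

    StringBuilder a = new StringBuilder("come");
    StringBuilder b = new StringBuilder("come");
    System.out.println(a.equals(b));       // false
    System.out.println(sameContent(a, b)); // true
  }
}
```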

Java String Conversion

Java String conversion is a huge topic by itself. I will keep only String in scope for this section.
  • The + operator can be used to perform String conversion.
  • If the + operator is used with two int primitives, it returns the sum of the two numbers.
  • To use it as a concatenation operator, at least one of the operands must be a java String.
Example: 1 + " java"; results in "1 java".
So how does the above happen? When the + operator is used with a String and a java primitive, conceptually the following happens:
  1. The respective wrapper class is used to convert the primitive to an object. Like 1 -> new Integer(1);
  2. toString() of that wrapper class is invoked.
  • toString() is a method that belongs to the Object class. Every wrapper class overrides toString() to return a String representation of the wrapped primitive.
  • If you specifically want to convert a number to a String, apart from the + operator you can use String.valueOf(), or String.format() / printf() for format conversion.
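Here is a short sketch of the conversion options mentioned above. The class and method names (StringConversionDemo, viaPlus, viaValueOf, viaFormat) are made up for this illustration, and Locale.US is passed to String.format only so that the decimal separator is predictable.

```java
import java.util.Locale;

class StringConversionDemo {

  static String viaPlus(int n) {
    return "" + n;                 // + with a String operand converts n
  }

  static String viaValueOf(int n) {
    return String.valueOf(n);      // explicit conversion
  }

  static String viaFormat(double d) {
    return String.format(Locale.US, "%.2f", d); // printf-style format conversion
  }

  public static void main(String[] args) {
    System.out.println(1 + 2);             // 3       (two ints: addition)
    System.out.println(1 + " java");       // 1 java  (one String: concatenation)
    System.out.println(1 + 2 + " java");   // 3 java  (evaluated left to right)
    System.out.println("java " + 1 + 2);   // java 12
    System.out.println(viaPlus(42));       // 42
    System.out.println(viaValueOf(42));    // 42
    System.out.println(viaFormat(3.14159)); // 3.14
  }
}
```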

What is intern()

I hope you remember the memory pool of java Strings discussed in the paragraphs above. Java String has a method named intern(). When you invoke this method on a String,
  1. it checks whether an equal String is already available in the memory pool.
  2. If one exists, it returns that pooled String.
  3. Else, it adds this String to the memory pool and returns it.
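The three steps can be observed with a small sketch. The class name InternDemo and the helper sameAfterIntern() are made up for this illustration.

```java
class InternDemo {

  // Hypothetical helper: after interning, equal Strings share one pooled object.
  static boolean sameAfterIntern(String a, String b) {
    return a.intern() == b.intern();
  }

  public static void main(String[] args) {
    String literal = "javaphobia";             // already in the pool
    String created = new String("javaphobia"); // separate object outside the pool

    System.out.println(literal == created);          // false
    System.out.println(literal == created.intern()); // true
    System.out.println(sameAfterIntern(created, new String("javaphobia"))); // true
  }
}
```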

Java String Transformation

How do you transform a String to upper / lower case? Java String has nice utility methods for that:
toLowerCase(Locale locale)
toUpperCase(Locale locale)
A String can also be trimmed using trim(), which removes leading and trailing whitespace.
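A minimal sketch of these methods follows. The class name StringTransformDemo and the helper normalize() are made up for this illustration. The Turkish example is there only to show why the Locale argument matters: case mapping is locale-sensitive.

```java
import java.util.Locale;

class StringTransformDemo {

  // Hypothetical helper: trims the input and lower-cases it independently of the default locale.
  static String normalize(String raw) {
    return raw.trim().toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    System.out.println(normalize("  Java STRING  "));       // java string
    System.out.println("title".toUpperCase(Locale.ROOT));   // TITLE

    // With Turkish rules the lower-case i maps to a dotted capital I (U+0130).
    String turkish = "title".toUpperCase(Locale.forLanguageTag("tr"));
    System.out.println(turkish.equals("TITLE"));             // false
  }
}
```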

String Concatenation in java


You have been told many times: don't use + (the java plus operator) to concatenate Strings. We all know that it is not good for performance. Have you researched it? Do you know what is happening under the hood? Let's explore all about String concatenation now.
In the early ages of java, around jdk 1.2, everybody used + to concatenate two String literals. When I say literal I mean it. Strings are immutable. That is, a String cannot be modified. Then what happens when we concatenate?
For example -
String data = "Hello";
data = data + "World" ;
In the above java code snippet for String concatenation, it looks like the String is modified. That is not what happens. Until JDK 1.4 a StringBuffer was used internally, and from JDK 1.5 a StringBuilder is used to concatenate. After concatenation the resulting StringBuffer or StringBuilder is converted to a String.
Java experts say, “don’t use + but use StringBuffer”. But if + is going to use StringBuffer internally, what big difference is it going to make in String concatenation?
Look at the following example. I have used both + and StringBuffer as two different cases. In case 1, I am just using + to concatenate. In case 2, I am using a StringBuffer for the concatenation and then finally changing it back to a String. I used a timer to record the time taken for an example String concatenation.

/***************    Example   **************** */

class Clock {

  private final long startTime;

  public Clock() {
    startTime = System.currentTimeMillis();
  }

  public long getElapsedTime() {
    return System.currentTimeMillis() - startTime;
  }
}

public class StringConcatenationExample {

  static final int N = 47500;

  public static void main(String args[]) {

    Clock clock = new Clock();

    //String to be used for concatenation
    String string1 = "";
    for (int i = 1; i <= N; i++) {

      //String concatenation using +
      string1 = string1 + "*";
    }
    //Recording the time taken to concatenate
    System.out.println("Using + Elapsed time: " + clock.getElapsedTime());

    clock = new Clock();
    StringBuffer stringBuffer = new StringBuffer();
    for (int i = 1; i <= N; i++) {

      //String concatenation using StringBuffer
      stringBuffer.append("*");
    }
    String string2 = stringBuffer.toString();
    System.out.println("Using StringBuffer Elapsed time: " + clock.getElapsedTime());

  }
}
/**************** END *******************/

Look at the output (if you run this java program the result numbers might slightly vary based on your hardware / software configuration). The difference between the two cases is astonishing.
Output For The Above Example Program For String Concatenation
Using + Elapsed time: 3687
Using StringBuffer Elapsed time: 16
My question is, if + is using StringBuffer internally for concatenation, then why is there this huge difference in time? Let me explain. When + is used for concatenation, see how many steps are involved in every single iteration:
  1. A StringBuffer object is created.
  2. string1 is copied to the newly created StringBuffer object.
  3. The “*” is appended to the StringBuffer (concatenation).
  4. The result is converted back to a String object.
  5. The string1 reference is made to point at that new String.
  6. The old String that string1 previously referenced becomes unreachable and is left for the garbage collector.
Steps 2 and 4 copy the whole String built so far, so the total work grows roughly with the square of N, whereas case 2 keeps appending to one StringBuffer. Therefore you can see how the advice evolved: initially it was +, then StringBuffer came, and now StringBuilder.
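The steps can be written out as code. This is only a sketch of what a JDK 5 to 8 compiler roughly generates for string1 + "*"; newer compilers (Java 9 onwards) use a different mechanism, and the names ConcatenationDemo, plusExpanded and stars are made up for this illustration.

```java
class ConcatenationDemo {

  // Roughly what one "s = s + "*"" costs: a new builder, a full copy of s,
  // one append, and another full copy into the resulting String.
  static String plusExpanded(String s) {
    return new StringBuilder().append(s).append("*").toString();
  }

  // The recommended form for loops: one builder, one final conversion.
  static String stars(int n) {
    StringBuilder builder = new StringBuilder(n);
    for (int i = 0; i < n; i++) {
      builder.append('*');
    }
    return builder.toString();
  }

  public static void main(String[] args) {
    System.out.println(plusExpanded("ab")); // ab*
    System.out.println(stars(5));           // *****
  }
}
```

StringBuilder is used here instead of StringBuffer because no other thread touches the builder, so the synchronization in StringBuffer buys nothing.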

How To Read Input From Console in Java (Part 2)

In the previous part I discussed streams (input and output) and the PrintStream, BufferedReader and InputStreamReader classes, and showed how to take input from the console.
In this section I'm going to discuss one more class, called Scanner. Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace. The resulting tokens may then be converted into values of different types using the various next methods.
In plain words, Scanner reads formatted input and converts it into its binary form. The addition of the Scanner class in JDK 5 makes it easier to read all types of numeric values, Strings, and other types of data, whether they come from a disk file, the keyboard, or another source.
The structure of the Scanner class is:

public final class Scanner
extends Object
implements Iterator<String>

There are two constructors that are particularly useful: one takes an InputStream object
as a parameter and the other takes a FileReader object as a parameter.

Scanner in = new Scanner(System.in);  // System.in is an InputStream
Scanner inFile = new Scanner(new FileReader("myFile"));

If the file "myFile" is not found, a FileNotFoundException is
thrown. This is a checked exception, so it must be caught or forwarded by
putting the phrase "throws FileNotFoundException" on the header of the method
in which the instantiation occurs and the header of any method that calls the
method in which the instantiation occurs.
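As a sketch of the "caught" option, the snippet below opens a file inside try-with-resources and handles the checked exception on the spot. The class name OpenFileDemo and the helper firstToken() are made up for this illustration, and returning null for a missing file is just one possible design.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Scanner;

class OpenFileDemo {

  // Hypothetical helper: returns the first token of the file,
  // or null if the file is empty or cannot be opened.
  static String firstToken(String path) {
    try (Scanner inFile = new Scanner(new FileReader(path))) {
      return inFile.hasNext() ? inFile.next() : null;
    } catch (FileNotFoundException e) {
      System.out.println("Cannot open " + path + ": " + e.getMessage());
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(firstToken("myFile"));
  }
}
```

The alternative is to drop the catch block and declare throws FileNotFoundException on firstToken() and on every method that calls it.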

Numeric and String Methods

Method                 Returns
int nextInt()          Returns the next token as an int. If the next token is not an integer or is out of range, InputMismatchException is thrown.
long nextLong()        Returns the next token as a long. If the next token is not an integer or is out of range, InputMismatchException is thrown.
float nextFloat()      Returns the next token as a float. If the next token is not a float or is out of range, InputMismatchException is thrown.
double nextDouble()    Returns the next token as a double. If the next token is not a floating-point number or is out of range, InputMismatchException is thrown.
String next()          Finds the next complete token from this scanner and returns it as a string; a token is usually ended by whitespace such as a blank or line break. If no token exists, NoSuchElementException is thrown.
String nextLine()      Returns the rest of the current line, excluding any line separator at the end.
void close()           Closes the scanner.

The Scanner looks for tokens in the input. A token is a series of characters
that ends with what Java calls whitespace. A whitespace character
can be a blank, a tab character, or a line break such as a carriage return; the end of the file also ends a token.

Thus, if we read a line that has a series of numbers separated by blanks, the
scanner will take each number as a separate token. Although we have shown only four numeric
methods, each numeric data type has a corresponding method that reads values of that type.

The numeric values may all be on one line with blanks between each value or may be on separate
lines. Whitespace characters (blanks or carriage returns) act as separators. The next method returns
the next input value as a string, regardless of what is keyed.

For example, the following code segment reads one value of each type:
  int number = in.nextInt();
  float real = in.nextFloat();
  long number2 = in.nextLong();
  double real2 = in.nextDouble();
  String string = in.next();


Here is a program that uses these methods, followed by the output.  Look over the application carefully to be sure you understand how the output was generated.
//**********************************************************************
// Class NumericInput demonstrates reading numeric values.
//**********************************************************************

import java.util.Scanner;  // System.out needs no import; it lives in java.lang

public class NumericInput
{
  public static void main(String[] args)
  {
    // Declarations
    Scanner in = new Scanner(System.in);

    int integer;
    long longInteger;
    float realNumber;
    double doubleReal;

    String string1;
    String string2;

    // Prompts
    System.out.println("Enter an integer, a long integer, " + "a floating-point ");
    System.out.println("number, another floating-point number, " + "and a string.");
    System.out.println("Separate each with a blank or return.");

    // Read in values
    integer = in.nextInt();
    longInteger = in.nextLong();
    realNumber = in.nextFloat();
    doubleReal = in.nextDouble();
    string1 = in.nextLine();
    System.out.println("Now enter another value.");
    string2 = in.next();

    System.out.println("Here is what you entered: ");
    System.out.println(integer + " " + longInteger + " " + realNumber + " " + doubleReal + " " 
        + string1 + " and " + string2);
  }
}

Output:

Enter an integer, a long integer, a floating-point
number, another floating-point number, and a string.
Separate each with a blank or return.
23
24
25.0 233333333333333.444 Hello

Now enter another value.
23.4
Here is what you entered:
23 24 25.0 2.3333333333333344E14  Hello and 23.4

What would happen if there were no token in the file in the previous example?
The reading methods would throw NoSuchElementException. Scanner also provides boolean look-ahead methods
such as hasNextInt() and hasNextDouble(), and each of these would return false. They return true if and only if the next token in the
scanner can be interpreted as a value of their type. We return to the subject of reading data from
files in the Files section below. Except for some trivial cases, we must
combine reading operations with loops to read through all of the data on a file.
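The look-ahead idea can be sketched as follows. The class name LookAheadDemo and the helper sumOfInts() are made up for this illustration, and a String is used as the input source only to keep the example self-contained.

```java
import java.util.Scanner;

class LookAheadDemo {

  // Hypothetical helper: adds up the integer tokens and skips every other token.
  static int sumOfInts(String text) {
    int sum = 0;
    try (Scanner in = new Scanner(text)) {
      while (in.hasNext()) {          // false once no token is left
        if (in.hasNextInt()) {        // look ahead without consuming the token
          sum += in.nextInt();
        } else {
          in.next();                  // consume and ignore a non-integer token
        }
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(sumOfInts("23 24 Hello 3")); // 50
    System.out.println(sumOfInts(""));              // 0
  }
}
```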

Files

To read from a file rather than the keyboard, you instantiate a Scanner object
with a FileReader object rather than System.in. 

Scanner in = new Scanner(System.in);   // Reading from the keyboard
Scanner inFile = new Scanner(new FileReader("inFile.dat"));   // Reading from a file

Although all of the methods applied to keyboard input can be applied to file input,
there are methods that are usually applied only to files. These are the methods that
ask if there are more values in the file. If there are no more values in a file, we say that
the file is at the end of the file (EOF).  For example,
inFile.hasNext();
inFile.hasNextLine();

return true if inFile has another token in the file or if there is another line in the file, respectively.
What about the methods hasNextInt and so forth that we used to look ahead at the
type of the next input token? These can be used to determine if there are more data values
in the file, provided you know exactly how the files are organized.
Be sure to close all files. If you forget to close System.in,
no harm is done, but forgetting to close a file can cause problems.
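Putting the pieces together, here is a sketch of a read-until-EOF loop that also closes the file. The class name ReadAllNumbers, the helper readDoubles() and the temporary file are made up for this illustration; Locale.US is set only so that "1.5" is parsed the same way on every machine.

```java
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class ReadAllNumbers {

  // Hypothetical helper: reads doubles until the next token is not a number or EOF is reached.
  static List<Double> readDoubles(Readable source) {
    List<Double> values = new ArrayList<>();
    try (Scanner inFile = new Scanner(source)) { // closing the Scanner closes the source too
      inFile.useLocale(Locale.US);
      while (inFile.hasNextDouble()) {
        values.add(inFile.nextDouble());
      }
    }
    return values;
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("inFile", ".dat");
    Files.write(file, "1.5 2.5\n3\n".getBytes());

    System.out.println(readDoubles(new FileReader(file.toFile()))); // [1.5, 2.5, 3.0]
    Files.delete(file);
  }
}
```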


Difference between Scanner and BufferedReader

Both can be used to read standard input, but Scanner is used for parsing tokens from the contents of the stream, while BufferedReader just reads the stream and does not do any special parsing.

  • BufferedReader is synchronized and Scanner is not, so it is up to you to decide which one fits your case.
  • The Scanner has a smaller buffer (1,024 chars) than the BufferedReader (8,192 chars by default), but it's more than enough.
  • Scanner is memory/CPU heavy (at least when compared to BufferedReader) because it internally uses regular expressions for matching your "nextXXX" calls, as opposed to just reading everything till the end of the line as in the case of a regular Reader.
  • Scanner can tokenize using a custom delimiter and parse the stream into primitive types of data, while BufferedReader can only read and store String.
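The last point is easiest to see side by side. In this sketch both methods add up the numbers in a comma-separated line; the class name ScannerVsBufferedReader, the method names and the sample input are made up for this illustration, and the BufferedReader version assumes the input has at least one line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

class ScannerVsBufferedReader {

  // BufferedReader hands back the raw line; splitting and parsing are up to the caller.
  static int sumWithBufferedReader(String input) {
    try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
      int sum = 0;
      for (String part : reader.readLine().split(",")) {
        sum += Integer.parseInt(part.trim());
      }
      return sum;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Scanner tokenizes on a custom delimiter and parses each int itself.
  static int sumWithScanner(String input) {
    try (Scanner scanner = new Scanner(input).useDelimiter("\\s*,\\s*")) {
      int sum = 0;
      while (scanner.hasNextInt()) {
        sum += scanner.nextInt();
      }
      return sum;
    }
  }

  public static void main(String[] args) {
    System.out.println(sumWithBufferedReader("10, 20,30")); // 60
    System.out.println(sumWithScanner("10, 20,30"));        // 60
  }
}
```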

Friday, 16 August 2013

Java Access Modifiers

Introduction to Java Access Modifiers
Access to classes, constructors, methods and fields is regulated using access modifiers, i.e. a class can control what information or data is accessible to other classes. To take advantage of encapsulation, you should minimize access whenever possible.
Java provides a number of access modifiers to help you set the level of access you want for classes as well as for the fields, methods and constructors in your classes. A member has package (default) accessibility when no access modifier is specified.
Java comes with four access levels. They are
  1. public
  2. protected
  3. default (no modifier)
  4. private

I) Class level access modifiers (java classes only)
Only two access levels are allowed for a top-level class: public and no modifier.
  • If a class is ‘public’, then it CAN be accessed from ANYWHERE.
  • If a class has ‘no modifier’, then it CAN ONLY be accessed from the ‘same package’.
II) Member level access modifiers (java variables and java methods)
All four are allowed: public, private, protected and no modifier.
  • public and no modifier – work the same way as at class level.
  • private – members CAN ONLY be accessed from within the same class.
  • protected – members CAN be accessed from the ‘same package’, and from a subclass
    in any package.
For better understanding, member level access is formulated as a table:


Access Modifiers               Same Class   Same Package   Subclass   Other packages
public                         Y            Y              Y          Y
protected                      Y            Y              Y          N
no access modifier (default)   Y            Y              N          N
private                        Y            N              N          N

First row {public Y Y Y Y} should be interpreted as:
  • Y – A member declared with ‘public’ access modifier CAN be accessed by the members of the ‘same class’.
  • Y – A member declared with ‘public’ access modifier CAN be accessed by the members of the ‘same package’.
  • Y – A member declared with ‘public’ access modifier CAN be accessed by the members of the ‘subclass’.
  • Y – A member declared as ‘public’ CAN be accessed from ‘Other packages’

Second row {protected Y Y Y N} should be interpreted as:
  • Y – A member declared with ‘protected’ access modifier CAN be accessed by the members of the ‘same class’.
  • Y – A member declared with ‘protected’ access modifier CAN be accessed by the members of the ‘same package’.
  • Y – A member declared with ‘protected’ access modifier CAN be accessed by the members of the ‘subclass’.
  • N – A member declared with ‘protected’ access modifier CANNOT be accessed by the members of the ‘Other package’.

Similarly, interpret the access modifiers table for the third (no access modifier) and fourth (private access modifier) rows.
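The "Same Class" and "Same Package" columns can be tried out in a single file. This is only a sketch: the class names AccessModifiersDemo and Account and their fields are made up for this illustration, and both classes live in the same (default) package, so the "Subclass" and "Other packages" columns are not exercised here.

```java
class AccessModifiersDemo {

  // Same package as Account: everything except the private member is reachable.
  static String visibleFromSamePackage() {
    Account account = new Account();
    // account.pin would not compile here: private members are visible only inside Account.
    return account.owner + " " + account.branch + " " + account.type;
  }

  public static void main(String[] args) {
    System.out.println(visibleFromSamePackage());      // public protected default
    System.out.println(new Account().describePin());   // pin is private
  }
}

class Account {
  public String owner = "public";
  protected String branch = "protected";
  String type = "default";            // no modifier: package (default) access
  private String pin = "private";

  // Same class: the private member is reachable here.
  public String describePin() {
    return "pin is " + pin;
  }
}
```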