Updates

Saturday 17 August 2013

String Concatenation in java


You have been told many times, don't use + (java plus operator) to concatenate Strings. We all know that it is not good for performance. Have you researched it? Do you know what is happening behind the hood? Lets explore all about String concatenation now.
In the initial ages of java around jdk 1.2 every body used + to concatenate two String literals. When I say literal I mean it. Strings are immutable. That is, a String cannot be modified. Then what happens when we do ?
For example -
String data = "Hello";
data = data + "World" ;
In the above java code snippet for String concatenation, it looks like the String is modified. It is not happening. Until JDK 1.4 StringBuffer is used internally and from JDK 1.5 StringBuilder is used to concatenate. After concatenation the resultant StringBuffer or StringBuilder is changed to String.
When a java experts say, “don’t use + but use StringBuffer”. If + is going to use StringBuffer internally what big difference it is going to make in String concatenation?
Look at the following example. I have used both + and StringBuffer as two different cases. In case 1, I am just using + to concatenate. In case 2, I am changing the String to StringBuffer and then doing the concatenation. Then finally changing it back to String. I used a timer to record the time taken for an example String concatenation.

/***************    Example   **************** */

class Clock {

  private final long startTime;

  public Clock() {
    startTime = System.currentTimeMillis();
  }

  public long getElapsedTime() {
    return System.currentTimeMillis() - startTime;
  }
}

public class StringConcatenationExample {

  static final int N = 47500;

  public static void main(String args[]) {

    Clock clock = new Clock();

    //String to be used for concatenation
    String string1 = "";
    for (int i = 1; i <= N; i++) {

      //String concatenation using +
      string1 = string1 + "*";
    }
    //Recording the time taken to concatenate
    System.out.println("Using + Elapsed time: " + clock.getElapsedTime());

    clock = new Clock();
    StringBuffer stringBuffer = new StringBuffer();
    for (int i = 1; i <= N; i++) {

      //String concatenation using StringBuffer
      stringBuffer.append("*");
    }
    String string2 = stringBuffer.toString();
    System.out.println("Using StringBuffer Elapsed time: " + clock.getElapsedTime());

  }
}
/**************** END *******************/

Look at the output (if you run this java program the result numbers might slightly vary based on your hardware / software configuration). The difference between the two cases is astonishing.
Output For The Above Example Program For String Concatenation
Using + Elapsed time: 3687
Using StringBuffer Elapsed time: 16
My argument is, if + is using StringBuffer internally for concatenation, then why is this huge difference in time? Let me explain that, when a + is used for concatenation see how many steps are involved:
  1. A StringBuffer object is created
  2. string1 is copied to the newly created StringBuffer object
  3. The “*” is appended to the StringBuffer (concatenation)
  4. The result is converted to back to a String object.
  5. The string1 reference is made to point at that new String.
  6. The old String that string1 previously referenced is then made null.
Therefore you can see initially it was +, then StringBuffer came and now StringBuilder.

No comments:

Post a Comment