Java String Concatenation and Performance

The quick and dirty way to concatenate strings in Java is to use the concatenation operator (+). This will yield a reasonable performance if you need to combine two or three strings (fixed-size). But if you want to concatenate n strings in a loop, the performance degrades in multiples of n. Given that String is immutable, for large number of string concatenation operations, using (+) will give us a worst performance. But how bad ? How StringBuffer, StringBuilder or String.concat() performs if we put them on a performance test ?. This article will try to answer those questions.

We will be using Perf4J to calculate the performance, since this library will give us aggregated performance statistics like mean, minimum, maximum, standard deviation over a set time span. In the code, we will concatenate a string (*) repeatedly 50,000 times and this iteration will be performed 21 times so that we can get a good standard deviation. The following methods will be used to concatenate strings.

And finally we will look at the byte code to see how each of these operations perform. Let’s start building the class. Note that each of the block in the code should be wrapped around the Perf4J library to calculate the performance in each iteration. Let’s define the outer and inner iterations first.

private static final int OUTER_ITERATION=20;
private static final int INNER_ITERATION=50000;

Now let’s implement each of the four methods mentioned in the article. Nothing fancy here, plain implementations of (+), String.concat(), StringBuffer.append() & StringBuilder.append().

String addTestStr = "";
String concatTestStr = "";
StringBuffer concatTestSb = null;
StringBuilder concatTestSbu = null;

for (int outerIndex=0;outerIndex<=OUTER_ITERATION;outerIndex++) {
    StopWatch stopWatch = new LoggingStopWatch("StringAddConcat");
    addTestStr = "";
    for (int innerIndex=0;innerIndex<=INNER_ITERATION;innerIndex++)
        addTestStr += "*";
    stopWatch.stop();
}

for (int outerIndex=0;outerIndex<=OUTER_ITERATION;outerIndex++) {
    StopWatch stopWatch = new LoggingStopWatch("StringConcat");
    concatTestStr = "";
    for (int innerIndex=0;innerIndex<=INNER_ITERATION;innerIndex++)
        concatTestStr = concatTestStr.concat("*");
    stopWatch.stop();
}

for (int outerIndex=0;outerIndex<=OUTER_ITERATION;outerIndex++) {
    StopWatch stopWatch = new LoggingStopWatch("StringBufferConcat");
    concatTestSb = new StringBuffer();
    for (int innerIndex=0;innerIndex<=INNER_ITERATION;innerIndex++)
        concatTestSb.append("*");
    stopWatch.stop();
}

for (int outerIndex=0;outerIndex<=OUTER_ITERATION;outerIndex++) {
    StopWatch stopWatch = new LoggingStopWatch("StringBuilderConcat");
    concatTestSbu = new StringBuilder();
    for (int innerIndex=0;innerIndex<=INNER_ITERATION;innerIndex++)
        concatTestSbu.append("*");
    stopWatch.stop();
}

Let’s run this program and generate the performance metrics. I ran this program in a 64-bit OS (Windows 7), 32-bit JVM (7-ea), Core 2 Quad CPU (2.00 GHz) with 4 GB RAM.

The output from the 21 iterations of the program is plotted below.

Well, the results are pretty conclusive and as expected. One interesting point to notice is how better String.concat performs. We all know String is immutable, then how the performance of concat is better. To answer the question we should look at the byte code. I have included the whole byte code in the download package, but let’s have a look at the below snippet.

45: new #7; //class java/lang/StringBuilder
48: dup
49: invokespecial #8; //Method java/lang/StringBuilder."<init>":()V
52: aload_1
53: invokevirtual #9; //Method java/lang/StringBuilder.append:
    (Ljava/lang/String;)Ljava/lang/StringBuilder;
56: ldc #10; //String *
58: invokevirtual #9; //Method java/lang/StringBuilder.append:
    (Ljava/lang/String;)Ljava/lang/StringBuilder;
61: invokevirtual #11; //Method java/lang/StringBuilder.toString:()
    Ljava/lang/String;
64: astore_1

This is the byte code for String.concat(), and its clear from this that the String.concat is using StringBuilder for concatenation and the performance should be as good as String Builder. But given that the source object being used is String, we do have some performance loss in String.concat.

So for the simple operations we should use String.concat compared to (+), if we don’t want to create a new instance of StringBuffer/Builder. But for huge operations, we shouldn’t be using the concat operator, as seen in the performance results it will bring down the application to its knees and spike up the CPU utilization. To have the best performance, the clear choice is StringBuilder as long as you do not need thread-safety or synchronization.

Full source of this application is available in my github page.