Java String Concatenation and Performance

The quick and dirty way to concatenate strings in Java is to use the concatenation operator (+). This will yield a reasonable performance if you need to combine two or three strings (fixed-size). But if you want to concatenate n strings in a loop, the performance degrades in multiples of n. Given that String is immutable, for large number of string concatenation operations, using (+) will give us a worst performance. But how bad ? How StringBuffer, StringBuilder or String.concat() performs if we put them on a performance test ?. This article will try to answer those questions.

We will be using Perf4J to calculate the performance, since this library will give us aggregated performance statistics like mean, minimum, maximum, standard deviation over a set time span. In the code, we will concatenate a string (*) repeatedly 50,000 times and this iteration will be performed 21 times so that we can get a good standard deviation. The following methods will be used to concatenate strings.

And finally we will look at the byte code to see how each of these operations perform. Let’s start building the class. Note that each of the block in the code should be wrapped around the Perf4J library to calculate the performance in each iteration. Let’s define the outer and inner iterations first.

private static final int OUTER_ITERATION=20;
private static final int INNER_ITERATION=50000;

Now let’s implement each of the four methods mentioned in the article. Nothing fancy here, plain implementations of (+), String.concat(), StringBuffer.append() & StringBuilder.append().

String addTestStr = "";
String concatTestStr = "";
StringBuffer concatTestSb = null;
StringBuilder concatTestSbu = null;

for (int outerIndex=0;outerIndex<=OUTER_ITERATION;outerIndex++) {
    StopWatch stopWatch = new LoggingStopWatch("StringAddConcat");
    addTestStr = "";
    for (int innerIndex=0;innerIndex<=INNER_ITERATION;innerIndex++)
        addTestStr += "*";
    stopWatch.stop();
}

for (int outerIndex=0;outerIndex<=OUTER_ITERATION;outerIndex++) {
    StopWatch stopWatch = new LoggingStopWatch("StringConcat");
    concatTestStr = "";
    for (int innerIndex=0;innerIndex<=INNER_ITERATION;innerIndex++)
        concatTestStr = concatTestStr.concat("*");
    stopWatch.stop();
}

for (int outerIndex=0;outerIndex<=OUTER_ITERATION;outerIndex++) {
    StopWatch stopWatch = new LoggingStopWatch("StringBufferConcat");
    concatTestSb = new StringBuffer();
    for (int innerIndex=0;innerIndex<=INNER_ITERATION;innerIndex++)
        concatTestSb.append("*");
    stopWatch.stop();
}

for (int outerIndex=0;outerIndex<=OUTER_ITERATION;outerIndex++) {
    StopWatch stopWatch = new LoggingStopWatch("StringBuilderConcat");
    concatTestSbu = new StringBuilder();
    for (int innerIndex=0;innerIndex<=INNER_ITERATION;innerIndex++)
        concatTestSbu.append("*");
    stopWatch.stop();
}

Let’s run this program and generate the performance metrics. I ran this program in a 64-bit OS (Windows 7), 32-bit JVM (7-ea), Core 2 Quad CPU (2.00 GHz) with 4 GB RAM.

The output from the 21 iterations of the program is plotted below.

Well, the results are pretty conclusive and as expected. One interesting point to notice is how better String.concat performs. We all know String is immutable, then how the performance of concat is better. To answer the question we should look at the byte code. I have included the whole byte code in the download package, but let’s have a look at the below snippet.

45: new #7; //class java/lang/StringBuilder
48: dup
49: invokespecial #8; //Method java/lang/StringBuilder."<init>":()V
52: aload_1
53: invokevirtual #9; //Method java/lang/StringBuilder.append:
    (Ljava/lang/String;)Ljava/lang/StringBuilder;
56: ldc #10; //String *
58: invokevirtual #9; //Method java/lang/StringBuilder.append:
    (Ljava/lang/String;)Ljava/lang/StringBuilder;
61: invokevirtual #11; //Method java/lang/StringBuilder.toString:()
    Ljava/lang/String;
64: astore_1

This is the byte code for String.concat(), and its clear from this that the String.concat is using StringBuilder for concatenation and the performance should be as good as String Builder. But given that the source object being used is String, we do have some performance loss in String.concat.

So for the simple operations we should use String.concat compared to (+), if we don’t want to create a new instance of StringBuffer/Builder. But for huge operations, we shouldn’t be using the concat operator, as seen in the performance results it will bring down the application to its knees and spike up the CPU utilization. To have the best performance, the clear choice is StringBuilder as long as you do not need thread-safety or synchronization.

Full source of this application is available in my github page.

Serial Key Generation and Validation in Java

In this post, I am going to show how to write a very basic serial key generation module for any Java based application - Same algorithms can be used for any programming language. The module consists of three parts.

WARNING: Hash functions used in this article are obsolete and extremely vulnerable to attacks. For real-world implementations, select a more secure hash algorithm.

  1. Algorithm for Serial Generation
  2. Generating the Serial
  3. Validating the Serial
Algorithm for Serial Generation

The following can be used as a simple algorithm for generating serial keys with 18 digits. In this method we are generating the serials based on the input the user gives, which can be the name of the user, company name etc. So most of the times the serial will be unique (based on the input). We get the name of the user as the input and generate the MD2, MD5 & SHA1 hashes for the string and concatenate together. This will generate a total of 104 digits, since we need only 18 of them, we can use a set of pre defined numbers to select the 18 digits.

The below figure explains the algorithm. I have selected some random numbers to pick the 18 digits from 104 character hash, you can use any number you like which makes the serial unique.

Generating the Serial

To generate the serial, we need a input string and based on the input string we will be generating MD2, MD5 and SHA1 hashes. The method calculateSecurityHash takes the input string and the hashing method as input and generates the hash based on the method.

String serialNumberEncoded = calculateSecurityHash(fullNameString,"MD2") +
    calculateSecurityHash(fullNameString,"MD5") +
        calculateSecurityHash(fullNameString,"SHA1");

Generating the Hash for the input string based on the type.

private String calculateSecurityHash(String stringInput, String algorithmName)
    throws java.security.NoSuchAlgorithmException {
        String hexMessageEncode = "";
        byte[] buffer = stringInput.getBytes();
        java.security.MessageDigest messageDigest =
            java.security.MessageDigest.getInstance(algorithmName);
        messageDigest.update(buffer);
        byte[] messageDigestBytes = messageDigest.digest();
        for (int index=0; index < messageDigestBytes.length ; index ++) {
            int countEncode = messageDigestBytes[index] & 0xff;
            if (Integer.toHexString(countEncode).length() == 1)
                hexMessageEncode = hexMessageEncode + "0";
            hexMessageEncode = hexMessageEncode + Integer.toHexString(countEncode);
        }
        return hexMessageEncode;
    }

Once all the three types of hashes are combined, we will have a total of 104 characters. Since we need only 18 for our serial key we can pick any random 18 digits from the combined hash. We cannot use random number generation to pick up the 18 digits since we need a valid exit strategy for validating the key.

String serialNumber = ""
    + serialNumberEncoded.charAt(32)  + serialNumberEncoded.charAt(76)
    + serialNumberEncoded.charAt(100) + serialNumberEncoded.charAt(50) + "-"
    + serialNumberEncoded.charAt(2)   + serialNumberEncoded.charAt(91)
    + serialNumberEncoded.charAt(73)  + serialNumberEncoded.charAt(72)
    + serialNumberEncoded.charAt(98)  + "-"
    + serialNumberEncoded.charAt(47)  + serialNumberEncoded.charAt(65)
    + serialNumberEncoded.charAt(18)  + serialNumberEncoded.charAt(85) + "-"
    + serialNumberEncoded.charAt(27)  + serialNumberEncoded.charAt(53)
    + serialNumberEncoded.charAt(102) + serialNumberEncoded.charAt(15)
    + serialNumberEncoded.charAt(99);

You can replace the numbers with anything between 0 and 103 so that the key is unique and based on the input string.

Validating the Serial

Now, whenever we get a user name and serial combination, we should be able to validate that. Since we already know the algorithm used to generate the serial, the validation part is pretty easier. We cannot follow the steps which we did while generating the serial in opposite direction because we will end up with a bunch of characters without any valid lead. Since we have the serial and user name to validate, we take the user name and generate the serial for the user name as per our algorithm. Once we have the serial number, we compare this against the serial which we got for validation, if both matches then we have a valid key.

String serialNumberEncoded =
    registrationAppSerialGenerationReversal.calculateSecurityHash(fullNameString,"MD2")
    + registrationAppSerialGenerationReversal.calculateSecurityHash(fullNameString,"MD5")
    + registrationAppSerialGenerationReversal.calculateSecurityHash(fullNameString,"SHA1");

String serialNumberCalc = ""
    + serialNumberEncoded.charAt(32)  + serialNumberEncoded.charAt(76)
    + serialNumberEncoded.charAt(100) + serialNumberEncoded.charAt(50) + "-"
    + serialNumberEncoded.charAt(2)   + serialNumberEncoded.charAt(91)
    + serialNumberEncoded.charAt(73)  + serialNumberEncoded.charAt(72)
    + serialNumberEncoded.charAt(98)  + "-" + serialNumberEncoded.charAt(47)
    + serialNumberEncoded.charAt(65)  + serialNumberEncoded.charAt(18)
    + serialNumberEncoded.charAt(85)  + "-" + serialNumberEncoded.charAt(27)
    + serialNumberEncoded.charAt(53)  + serialNumberEncoded.charAt(102)
    + serialNumberEncoded.charAt(15)  + serialNumberEncoded.charAt(99);

if (serialNumber.equals(serialNumberCalc))
    System.out.println("Serial MATCH");
else
    System.out.println("Serial MIS-MATCH");

Sample output from this program is demonstrated below.