Java Bytecode Manipulation

In this article, I will show how to manipulate a compiled class file directly without decompiling it to java.

I will be using Javassist (Java Programming Assistant), an external library for most of this tutorial. Download latest JAR file to get examples work. I am using version rel_3_22_0_cr1-4-g6a3ed31.

Every java file compiled will generate a class file which is a binary file containing Java bytecode which can be executed on any Java Virtual Machine. Since the class files are generally not dependent on the platform they are compiled on, it makes Java applications platform independent. In this article, we will explore how to statically analyze class files, modify them programmatically and execute.

Sample Class for Bytecode Manipulation

We will start with a simple test class (ByteCodeEditorTest) which we will use to modify using Javassist. This class file will get an input from user and check if it matches a predefined value within code and output message accordingly.

public String checkStatus(String _inputString){
    if (_inputString.equals("MAGIC"))
        return "Right!";
    return "Wrong";
}

Once compiled, and executed below is a sample behaviour of the class. We will modify compiled class file directly to change its behaviour by modifying equality operator.

$ java ByteCodeEditorTest TEST
Wrong
$ java ByteCodeEditorTest MAGIC
Right!

Let’s start by looking at the compiled class file using javap. I have provided snippet of checkStatus() method from test class.

$ javap -c ByteCodeEditorTest
Compiled from "ByteCodeEditorTest.java"
  public java.lang.String checkStatus(java.lang.String);
    Code:
       0: aload_1
       1: ldc           #7      // String MAGIC
       3: invokevirtual #8      // Method java/lang/String.equals:(Ljava/lang/Object;)Z
       6: ifeq          12
       9: ldc           #9      // String Right!
      11: areturn
      12: ldc           #10     // String Wrong
      14: areturn
}

The disassembled code contains mnemonic for Java bytecode instructions. We will be heavily using these as a part of bytecode manipulation. Refer to Java bytecode instruction listings Wikipedia article which contains all mnemonic and Opcode for Java bytecode.

Interesting line is on index 6 from disassembled code which contains mnemonic ifeq which compares input string against built in value. Let’s use Javassist to modify equality operator from ifeq to ifne.

Bytecode Manipulation using Javassist

Now that we have our test class and details on what has to be modified in bytecode, let’s create a new class file which loads compiled ByteCodeEditorTest class for manipulation. With Javassist JAR in classpath, let’s load the test class file using javassist.CtClass.

ClassPool _classPool = ClassPool.getDefault();
CtClass _ctClass = _classPool.makeClass(new FileInputStream("ByteCodeEditorTest.class"));

Once ByteCodeEditorTest class is loaded, we will use javassist.CtMethod to extract all the methods from class and then use javassist.bytecode.CodeAttribute & javassist.bytecode.CodeIterator to manipulate the class.

CodeIterator allows us to traverse every bytecode instruction from class file and also provides methods to manipulate them. In our case, from the javap output we know index 6 has to modified to change instruction set from ifeq to ifne. Looking at Opcode reference, hex value for ifne is 9a. We will be using decimal format to update bytecode using CodeIterator.

So we will be using CodeIterator.writeByte() method to update index 6 of ByteCodeEditorTest from exising value to 154 (9a converted to decimal). Below table shows existing value (row1) and new value (row2)

Mnemonic Opcode (Hex) Opcode (Decimal)
ifeq 0x99 153
ifne 0x9a 154
for(CtMethod _ctMethods:_ctClass.getDeclaredMethods()){
    CodeAttribute _codeAttribute = _ctMethods.getMethodInfo().getCodeAttribute();
    CodeIterator _codeIterator = _codeAttribute.iterator();
    while (_codeIterator.hasNext()) {
        int _indexOfCode = _codeIterator.next();
        int _valueOfIndex8Bit = _codeIterator.byteAt(_indexOfCode);
        //Checking index 6 and if Opcode is ifeq
        if(_valueOfIndex8Bit==153 && _indexOfCode==6) {
            //Changing instruction from ifeq to ifne
            _codeIterator.writeByte(154, _indexOfCode);
        }
    }
}
//Write changes to class file
_ctClass.writeFile();

Once this code is run, ByteCodeEditorTest class file will be modified with updated instructions. When running javap on ByteCodeEditorTest now, it will produce below result of checkStatus() method.

$ javap -c ByteCodeEditorTest
Compiled from "ByteCodeEditorTest.java"
  public java.lang.String checkStatus(java.lang.String);
    Code:
       0: aload_1
       1: ldc           #7      // String MAGIC
       3: invokevirtual #8      // Method java/lang/String.equals:(Ljava/lang/Object;)Z
       6: ifne          12
       9: ldc           #9      // String Right!
      11: areturn
      12: ldc           #10     // String Wrong
      14: areturn
}

As you can see, index 6 is now changed to ifne. Running ByteCodeEditorTest now will produce results which we were after.

$ java ByteCodeEditorTest TEST
Right!

ByteCodeEditorTest class file was successfully modified to alter program flow without the need for re-compilation or decompilation.

While this is a simple modification to a class file, we can do complex changes of adding new methods, classes, injecting code etc. using Javassist library. I will cover complex scenarios in another article, but will give a high level overview of frequently used in APIs in next section.

Other Javassist APIs

While I covered bytecode manipulation, Javassist is a powerful library which can be used for complex changes. Highlighting some of those features here.

javassist.CtMethod class can be used to inject new methods to existing class files.

//Defrosts so that the class can be modified
_ctClass.defrost();
CtMethod _ctMethod = CtNewMethod.make("public int newMethodFromJA() { return 1; }", _ctClass);
_ctClass.writeFile();

javassist.CtMethod class can also be used to inject code to existing class/methods using insertBefore(), insertAfter() and insertAt() methods.

for(CtMethod method:_ctClass.getDeclaredMethods()){
    //Defrosts so that the class can be modified
    _ctClass.defrost();
    method.insertBefore("System.out.println(\"Before every method call....\");");
    _ctClass.writeFile();
}

Javassist can also be used for static analysis of class files by displaying all method code (disassembled) of a class file or to display bytecode of a class file.

//Display Method Code
PrintStream _printStream = new PrintStream(System.out);
InstructionPrinter instructionPrinter = new InstructionPrinter(_printStream);
for(CtMethod method:_ctClass.getDeclaredMethods()){
    System.out.println("Method: " + method.getName());
    instructionPrinter.print(method);
}
//Display Bytecode
for(CtMethod _ctMethods:_ctClass.getDeclaredMethods()){
    _ctClass.defrost();
    System.out.println("Method: " +_ctMethods.getName());
    CodeAttribute _codeAttribute = _ctMethods.getMethodInfo().getCodeAttribute();
    CodeIterator _codeIterator = _codeAttribute.iterator();
    while (_codeIterator.hasNext()) {
        int _indexOfInstruction = _codeIterator.next();
        int _indexValue8Bit = _codeIterator.byteAt(_indexOfInstruction);
        System.out.println(Mnemonic.OPCODE[_indexValue8Bit]);
    }
}

Full source code for all snippets referenced in this article is available in my github page.

Dynamically Load Compiled Java Class as a Byte Array and Execute

As we know, all the compiled java classes runs inside the JVM. The default class loader from Sun loads the classes into JVM and executes it. This class loader is a part of JVM which loads the compiled byte code to memory. In this article, I will show how to convert a compiled java class to a array of bytes and then load these array of bytes into another class (which can be over the network) and execute the array of bytes.

So the question arises, why should we write a custom class loader ? There are some distinct advantages. Some of them below

  • We can load a class over any network protocol. Since the java class can be converted to a series of numbers (array of bytes), we can use most of the protocols.
  • Load Dynamic classes based on the type of user, especially useful when you want to validate the license of your software over the web and if you are paranoid about the security.
  • More flexible and secure, you can encrypt the byte stream (asymmetric or symmetric) ensuring safer delivery.

For this article we will be creating three classes

  1. JavaClassLoader – The custom class loader which will load the array of bytes and execute. In other words, the client program.
  2. Class2Byte – The Java class which converts any compiled class / object to a array of bytes
  3. ClassLoaderInput – The class which will be converted to array of bytes and transferred

Let’s divide this article into two sections, in the fist section we will convert the java class to array of bytes and in the second section, we will load that array.

Create & Convert the Java class to array of bytes

Let’s write a simple class (ClassLoaderInput) which just prints a line. This is the class which will be converted to a byte array.

public class ClassLoaderInput {
    public void printString() {
        System.out.println("Hello World!");
    }
}

Now, let’s write another class (Class2Byte) which will convert the ClassLoaderInput to a byte of array. The concept to convert the file is simple, compile the above file and load the class file through input stream and with an offset read and convert the class to bytes and write the output in to another out stream. We need these bytes as a comma separated value, so we will use StringBuffer to add comma between the bytes.

int _offset=0;
int _read=0;

File fileName = new File(args [0]);
InputStream fileInputStream = new FileInputStream(fileName);
FileOutputStream fileOutputStream = new FileOutputStream(args[1]);
PrintStream printStream = new PrintStream(fileOutputStream);
StringBuffer bytesStringBuffer = new StringBuffer();

byte[] byteArray = new byte[(int)fileName.length()];
while (_offset < byteArray.length &&
    (_read=fileInputStream.read(byteArray, _offset,
    byteArray.length-_offset)) >= 0)
        _offset += _read;

fileInputStream.close();
for (int index = 0; index < byteArray.length; index++)
    bytesStringBuffer.append(byteArray[index]+",");

printStream.print(bytesStringBuffer.length()==0 ? "" :
    bytesStringBuffer.substring(0, bytesStringBuffer.length()-1));

Now let’s run this file and generate the output. A sample output from my machine is below.

Now,we have the sample class (ClassLoaderInput) file as a bunch of numbers. Now this bunch of numbers can be transferred over any protocol to our custom class loader which will “reconstruct” the class from these bytes and run it, without any physical trace in the client machine (the array of bytes will be on memory).

Load the array of bytes and execute

Now, to the important part of this article, we are going to write a custom class loader which will load those bunch of numbers (array) and execute them. The array of bytes can be transferred over the network but in this example, we will define it as a string in the class loader for demonstration purpose.

Let’s start by defining the array of bytes.

private int[] data = {-54,-2,-70,-66,0,0,0,51,0,31,10,0,6,0,17,9,0,18,0,19,8,
    0,20,10,0,21,0,22,7,0,23,7,0,24,1,0,6,60,105,110,105,116,62,1,0,3,40,41,86,1,
    0,4,67,111,100,101,1,0,15,76,105,110,101,78,117,109,98,101,114,84,97,98,108,
    101,1,0,18,76,111,99,97,108,86,97,114,105,97,98,108,101,84,97,98,108,101,1,0,
    4,116,104,105,115,1,0,18,76,67,108,97,115,115,76,111,97,100,101,114,73,110,
    112,117,116,59,1,0,11,112,114,105,110,116,83,116,114,105,110,103,1,0,10,83,
    111,117,114,99,101,70,105,108,101,1,0,21,67,108,97,115,115,76,111,97,100,101,
    114,73,110,112,117,116,46,106,97,118,97,12,0,7,0,8,7,0,25,12,0,26,0,27,1,0,
    12,72,101,108,108,111,32,87,111,114,108,100,33,7,0,28,12,0,29,0,30,1,0,16,67,
    108,97,115,115,76,111,97,100,101,114,73,110,112,117,116,1,0,16,106,97,118,97,
    47,108,97,110,103,47,79,98,106,101,99,116,1,0,16,106,97,118,97,47,108,97,110,
    103,47,83,121,115,116,101,109,1,0,3,111,117,116,1,0,21,76,106,97,118,97,47,105,
    111,47,80,114,105,110,116,83,116,114,101,97,109,59,1,0,19,106,97,118,97,47,105,
    111,47,80,114,105,110,116,83,116,114,101,97,109,1,0,7,112,114,105,110,116,108,
    110,1,0,21,40,76,106,97,118,97,47,108,97,110,103,47,83,116,114,105,110,103,59,
    41,86,0,33,0,5,0,6,0,0,0,0,0,2,0,1,0,7,0,8,0,1,0,9,0,0,0,47,0,1,0,1,0,0,0,5,42,
    -73,0,1,-79,0,0,0,2,0,10,0,0,0,6,0,1,0,0,0,1,0,11,0,0,0,12,0,1,0,0,0,5,0,12,0,
    13,0,0,0,1,0,14,0,8,0,1,0,9,0,0,0,55,0,2,0,1,0,0,0,9,-78,0,2,18,3,-74,0,4,-79,
    0,0,0,2,0,10,0,0,0,10,0,2,0,0,0,3,0,8,0,4,0,11,0,0,0,12,0,1,0,0,0,9,0,12,0,13,
    0,0,0,1,0,15,0,0,0,2,0,16};

The conversion of these bytes to class is done by the ClassLoader.defineClass() method We should supply the stream of bytes that make up the class data. The bytes in positions off through off+len-1 should have the format of a valid class file as defined by the Java Virtual Machine Specification. The offset and length will be the additional parameters. Once the defineClass converts the array to class, then we can use reflection to execute the methods in the class.

JavaClassLoader _classLoader = new JavaClassLoader();
byte[] rawBytes = new byte[_classLoader.data.length];
for (int index = 0; index < rawBytes.length; index++)
    rawBytes[index] = (byte) _classLoader.data[index];
Class regeneratedClass = _classLoader.defineClass(args[0],
    rawBytes, 0, rawBytes.length);
regeneratedClass.getMethod(args[1], null).invoke(null, new Object[] { args });

Now, let’s compile the class loader and run. The the class file name & method name should be passed as a run time argument. If you have done everything right, you should see the output from the input class which we created (ClassLoaderInput) initially. Sample output from my machine below.

Full source of this application is available in my github page.