
How to get the Java source code for a given bytecode position in an automated way?

Tags: java, bytecode

I am experimenting with a static analysis tool that works on bytecode rather than source code, although I have the source code as well.

The tool outputs positions in the bytecode, and I need an easy way to map them back to the source code. NetBeans and Eclipse do this all the time (when I click on a method in an included library, the IDE takes me to the source, if it is available), so I know it is possible. I just could not figure out how to do it.

As an example, consider the following hello world program:

package mypackage;
import java.io.*;
class MyMainClass {
  public static void main(String[] args) { 
    BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); 
    String name0 = "Alice";
    String name1 = "Bob";
    try {
        name0 = in.readLine();
    }
    catch(Exception e) {
        System.out.println("Caught an exception!"); 
    }       
    System.out.println("Hello " + name0 + "!"); 
    System.out.println("Hello " + name1 + "!"); 
  }
}

The generated bytecode (obtained from javap) is:

Compiled from "MyMainClass.java"
class mypackage.MyMainClass extends java.lang.Object{
mypackage.MyMainClass();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   new     #2; //class java/io/BufferedReader
   3:   dup
   4:   new     #3; //class java/io/InputStreamReader
   7:   dup
   8:   getstatic       #4; //Field java/lang/System.in:Ljava/io/InputStream;
   11:  invokespecial   #5; //Method java/io/InputStreamReader."<init>":(Ljava/io/InputStream;)V
   14:  invokespecial   #6; //Method java/io/BufferedReader."<init>":(Ljava/io/Reader;)V
   17:  astore_1
   18:  ldc     #7; //String Alice
   20:  astore_2
   21:  ldc     #8; //String Bob
   23:  astore_3
   24:  aload_1
   25:  invokevirtual   #9; //Method java/io/BufferedReader.readLine:()Ljava/lang/String;
   28:  astore_2
   29:  goto    42
   32:  astore  4
   34:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   37:  ldc     #12; //String Caught an exception!
   39:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   42:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   45:  new     #14; //class java/lang/StringBuilder
   48:  dup
   49:  invokespecial   #15; //Method java/lang/StringBuilder."<init>":()V
   52:  ldc     #16; //String Hello
   54:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   57:  aload_2
   58:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   61:  ldc     #18; //String !
   63:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   66:  invokevirtual   #19; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   69:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   72:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   75:  new     #14; //class java/lang/StringBuilder
   78:  dup
   79:  invokespecial   #15; //Method java/lang/StringBuilder."<init>":()V
   82:  ldc     #16; //String Hello
   84:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   87:  aload_3
   88:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   91:  ldc     #18; //String !
   93:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   96:  invokevirtual   #19; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   99:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   102: return
  Exception table:
   from   to  target type
    24    29    32   Class java/lang/Exception


}

The output of the tool is something like:

<mypackage.MyMainClass> 39, 69, 99

These are bytecode offsets within main (the numbers in the left-hand column of the javap output), not source line numbers. Manually, I can figure out that they correspond to the following lines in the source code:

System.out.println("Caught an exception!"); 
System.out.println("Hello " + name0 + "!"); 
System.out.println("Hello " + name1 + "!"); 

However, I need to automate this process. Any help would be appreciated.

Jus12 asked May 21 '12 07:05



2 Answers

If you have both the source file and the source line number, the task is simple: read the file line by line and pick the line with that number.
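A minimal sketch of that lookup, assuming the path to the source file and a 1-based line number are already known. The class name `SourceLineReader` and its methods are hypothetical names introduced for illustration; they are not part of the tool in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: fetches one line of a source file by its 1-based number.
class SourceLineReader {

    // Picks the 1-based line out of an already loaded list of lines.
    static String lineAt(List<String> lines, int lineNumber) {
        if (lineNumber < 1 || lineNumber > lines.size()) {
            throw new IllegalArgumentException(
                "line " + lineNumber + " is outside 1.." + lines.size());
        }
        return lines.get(lineNumber - 1);
    }

    // Loads the whole file (assumed to be UTF-8) and returns the requested line.
    static String readLine(Path sourceFile, int lineNumber) throws IOException {
        return lineAt(Files.readAllLines(sourceFile, StandardCharsets.UTF_8), lineNumber);
    }
}
```

Reading the whole file is fine for ordinary source files; if many lines are requested from the same file, it would be cheaper to load the list once and call `lineAt` repeatedly.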

The catch is that getting from a bytecode offset to a source line number relies on metadata stored in the class file as attributes. The two in question, SourceFile and LineNumberTable, are both optional, so there is no guarantee they will be present. Verify that the class files you are analyzing were compiled with this information! (javac emits both by default unless it is run with -g:none, and javap -l prints the LineNumberTable of each method.)
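Once a method's LineNumberTable is available, the mapping itself is a floor lookup: each entry gives the bytecode offset at which the code for a source line starts, so an offset belongs to the entry with the largest start offset not exceeding it. The sketch below assumes the table is supplied as text in the "line N: pc" form that javap -l prints. The class `OffsetToLine` and its methods are hypothetical, and the table in `main` is hand-written for illustration, not actual javap output for the program in the question.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps a bytecode offset to a source line for ONE method.
class OffsetToLine {

    // Matches entries such as "line 12: 34" (source line, then start offset).
    private static final Pattern ENTRY = Pattern.compile("line\\s+(\\d+):\\s+(\\d+)");

    // Builds a map from start offset to source line from one method's table text.
    static NavigableMap<Integer, Integer> parseLineNumberTable(String text) {
        NavigableMap<Integer, Integer> table = new TreeMap<>();
        Matcher m = ENTRY.matcher(text);
        while (m.find()) {
            int line = Integer.parseInt(m.group(1));
            int startPc = Integer.parseInt(m.group(2));
            table.put(startPc, line);
        }
        return table;
    }

    // Returns the source line covering the offset, or -1 if no entry precedes it.
    static int lineForOffset(NavigableMap<Integer, Integer> table, int pc) {
        Map.Entry<Integer, Integer> entry = table.floorEntry(pc);
        return entry == null ? -1 : entry.getValue();
    }

    public static void main(String[] args) {
        // Illustrative, hand-written table (an assumption, not real tool output).
        String text = "line 5: 0\nline 9: 24\nline 12: 34\n"
                    + "line 14: 42\nline 15: 72\nline 16: 102\n";
        NavigableMap<Integer, Integer> table = parseLineNumberTable(text);
        for (int pc : new int[] {39, 69, 99}) {
            System.out.println(pc + " -> line " + lineForOffset(table, pc));
        }
        // With the table above this prints:
        // 39 -> line 12
        // 69 -> line 14
        // 99 -> line 15
    }
}
```

Offsets are only meaningful per method, so a real tool would need one such table for each method and would have to know which method each reported offset belongs to.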

Note: These same attributes are used for providing the information for stack traces via Throwable.getStackTrace, in case you were wondering.

obataku answered Sep 23 '22 12:09



Another option to consider is taking advantage of an existing static analysis tool such as FindBugs, which already has plenty of functionality for showing line numbers for flagged sections of code, as well as producing nice HTML reports. You can extend FindBugs with your own bytecode analysis relatively easily. I have followed the tutorial here with success: http://www.ibm.com/developerworks/library/j-findbug2/

Holly Cummins answered Sep 20 '22 12:09
