Is there a Java bytecode optimizer that removes useless gotos?

Problem: I have a method that compiles to over 8000 bytes of Java bytecode. HotSpot has a magic limit: the JIT does not kick in for methods that exceed 8000 bytes. (Yes, it is reasonable to have a huge method. This is a tokenizer loop.) The method is in a library, and I don't want to require users of the library to configure HotSpot to deactivate the magic limit.

Observation: Decompiling the bytecode shows that the Eclipse Java Compiler generates a lot of pointless gotos. (javac is even worse.) That is, there are gotos that are only reachable from jumps. Obviously, a jump that lands on such a goto should instead jump directly to the goto's target, and the goto should then be eliminated.

Question: Is there a bytecode optimizer for Java 5 class files that flattens pointless jump chains and then removes unnecessary gotos?

Edit: I mean patterns like:

8698:   goto    8548
8701:   goto    0

Obviously, the second goto can only be reached by a jump to 8701 which might as well be a direct jump to 0.
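A minimal sketch of the jump-chain flattening described above, assuming the gotos have already been extracted into a map from instruction offset to target offset. The class name `JumpThreading` and the method `resolve` are hypothetical, and a real optimizer would also have to rewrite the branch instructions and recompute offsets after deleting dead gotos, which this sketch does not attempt.

```java
import java.util.HashMap;
import java.util.Map;

public class JumpThreading {
    /**
     * Follows a chain of unconditional gotos starting at target and returns
     * the final destination. The map holds, for each goto instruction, its
     * own offset as key and the offset it jumps to as value.
     */
    public static int resolve(Map<Integer, Integer> gotos, int target) {
        int hops = 0;
        // Cap the number of hops so a cycle of gotos cannot loop forever.
        while (gotos.containsKey(target) && hops++ < gotos.size()) {
            target = gotos.get(target);
        }
        return target;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> gotos = new HashMap<Integer, Integer>();
        gotos.put(8698, 8548); // 8698: goto 8548
        gotos.put(8701, 0);    // 8701: goto 0
        // Any branch aimed at 8701 can be retargeted to the resolved offset.
        System.out.println(resolve(gotos, 8701)); // prints 0
    }
}
```

Once every branch has been retargeted this way, a goto that is no longer the target of any jump and cannot be reached by fall-through is dead code and can be dropped.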

On a second investigation, this questionable pattern is more common:

4257:   if_icmpne   4263
4260:   goto    8704
4263:   aload_0

Here, obviously, one would like the compiler to invert the "not equal" comparison into an "equal" comparison that jumps to 8704, and to eliminate the goto.
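The inversion can be sketched as follows. This is an illustration on mnemonic strings, not a class-file rewriter: the class `BranchInversion` and the method `invert` are hypothetical. It assumes the goto is the 3-byte form (not `goto_w`), that nothing else jumps to the goto itself, and that the new target still fits in the branch's 16-bit offset.

```java
import java.util.HashMap;
import java.util.Map;

public class BranchInversion {
    // Each JVM conditional branch mapped to the branch testing the opposite condition.
    static final Map<String, String> INVERSE = new HashMap<String, String>();
    static {
        String[][] pairs = {
            {"ifeq", "ifne"}, {"iflt", "ifge"}, {"ifgt", "ifle"},
            {"if_icmpeq", "if_icmpne"}, {"if_icmplt", "if_icmpge"},
            {"if_icmpgt", "if_icmple"}, {"if_acmpeq", "if_acmpne"},
            {"ifnull", "ifnonnull"}
        };
        for (String[] p : pairs) {
            INVERSE.put(p[0], p[1]);
            INVERSE.put(p[1], p[0]);
        }
    }

    /**
     * Given a conditional branch immediately followed by a goto, returns the
     * single inverted branch that replaces both, or null if the pattern does
     * not match (the branch must skip exactly over the goto).
     */
    public static String invert(String cond, int condTarget, int gotoOffset, int gotoTarget) {
        // A plain goto occupies 3 bytes: one opcode byte plus a 2-byte offset.
        boolean skipsOnlyTheGoto = condTarget == gotoOffset + 3;
        if (!skipsOnlyTheGoto || !INVERSE.containsKey(cond)) {
            return null;
        }
        return INVERSE.get(cond) + " " + gotoTarget;
    }

    public static void main(String[] args) {
        // 4257: if_icmpne 4263 / 4260: goto 8704 / 4263: aload_0
        System.out.println(invert("if_icmpne", 4263, 4260, 8704)); // prints if_icmpeq 8704
    }
}
```

Removing the goto saves 3 bytes per occurrence, but it also shifts every later offset, so a real tool would have to recompute all branch targets afterwards.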

hsivonen asked Jun 03 '09 12:06

1 Answer

I feel your pain. I once had to write a parser that had around 5 kloc of if(str.equals(...)) code. I broke it into several methods along the lines of parse1, parse2, etc. If parse1 didn't produce a parsed answer, parse2 was called, and so on. This isn't necessarily best practice, but it does do what you need it to.
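A minimal sketch of that splitting approach, assuming each stage returns null when it does not recognize the input. The class name `ChainedParser`, the token names, and the specific strings matched are hypothetical; only the parse1/parse2 fallback structure comes from the description above.

```java
public class ChainedParser {
    /** Tries each stage in order and returns the first non-null result. */
    public static String parse(String s) {
        String result = parse1(s);
        if (result == null) {
            result = parse2(s);
        }
        return result;
    }

    // First slice of the original if(str.equals(...)) chain.
    static String parse1(String s) {
        if (s.equals("if")) return "KEYWORD_IF";
        if (s.equals("else")) return "KEYWORD_ELSE";
        return null; // not handled here; fall through to the next stage
    }

    // Second slice; further stages (parse3, ...) would follow the same shape.
    static String parse2(String s) {
        if (s.equals("+")) return "OP_PLUS";
        if (s.equals("-")) return "OP_MINUS";
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parse("else")); // prints KEYWORD_ELSE
        System.out.println(parse("-"));    // prints OP_MINUS
    }
}
```

Each stage compiles to its own method, so every one of them can stay under the 8000-byte limit on its own.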

KitsuneYMG answered Nov 02 '22 23:11