
JVM JIT diagnostic tools and optimization tips

I hear a lot about what JVM JITs can do, but I don't see much information on how to profile what the JIT is actually doing in a given run of a program. There are lots of tips about using -XX:+PrintCompilation and -XX:+PrintOptoAssembly, but they produce very low-level output that is hard to interpret.

In general, during optimization, I like to have a benchmark suite of common operations with dedicated JIT warmup time and so on, but I'd like to be able to see which optimizations are actually firing on my code. Perhaps my JVM considered inlining a particular method call but something about it made it decide not to, or perhaps the JIT was unable to avoid array bounds checks in my loops because I phrased my invariants and looping conditions too obscurely. I'd expect a tool like YourKit to support some form of "what is going on with the JIT" but I haven't been able to find support for that in YourKit or anywhere else.

Ideally I'd just like a brain dump of what the JIT's optimizer is thinking during a run of my program. Say I've warmed up my function plenty, and the JIT decided to inline three methods into my inner loop and split the loop into three sections with no array bounds checks on the middle section. I'd like a summary of those decisions and the motivation for them.

Am I missing something obvious here? What do JVM performance-aware programmers do when optimizing tight inner loops to figure out what is going on? Surely the low-level -XX flags can't be the only option, can they? I'd appreciate hints on how best to deal with this sort of low-level stuff on the JVM. And no, this question is not motivated by premature optimization! :)

Edit: I guess some of what I want is given by -XX:+LogCompilation but I'm still curious if people have general tips and tools for this kind of activity.

copumpkin asked Apr 28 '13 02:04




1 Answer

If you want a brain dump, you can print the resulting assembly code, but this is much lower level than what you have already. I suspect what you are looking for doesn't exist for the HotSpot JVM. I saw a presentation for something like this based on JRockit and perhaps this will make it into HotSpot one day.

Am I missing something obvious here? What do JVM performance-aware programmers do when optimizing tight inner loops to figure out what is going on?

Usually I just minimise garbage production, and that alone tends to perform well enough, e.g. for microsecond latencies.
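To make "minimise garbage production" concrete, here is a minimal sketch. The class and method names are hypothetical and not from the original answer. It contrasts a boxed accumulator, which may allocate on every iteration, with a primitive one, which allocates nothing in the loop. Whether the JIT removes the boxing through escape analysis depends on the JVM version and flags, so treat this as an illustration rather than a measurement.

```java
public class GarbageDemo {
    // Boxed accumulator: each += unboxes, adds, and re-boxes the total.
    // Totals outside the small Long cache (-128..127) may allocate a new
    // Long per iteration unless the JIT manages to eliminate the boxing.
    public static long sumBoxed(int[] values) {
        Long total = 0L;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Primitive accumulator: same result, no allocation inside the loop.
    public static long sumPrimitive(int[] values) {
        long total = 0L;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = new int[1_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i;
        }
        // Both variants compute the same sum; only the garbage differs.
        System.out.println(sumBoxed(values) == sumPrimitive(values));
    }
}
```

Running both variants under a profiler that reports allocation should show the difference without any need to read JIT output.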

This sort of micro-optimisation really requires a deep understanding of machine code and how CPUs actually work.

Surely the low-level -XX flags can't be the only option, can they?

If only it were that simple; it is far more complicated. To dump the machine code you need an additional native library which doesn't ship with the JVM. ;)
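For anyone who wants to try the flags discussed in this thread, a small hot loop is enough to see them in action. The sketch below is an assumption-laden example, not part of the original answer: the class `JitProbe` and its methods are hypothetical, and the iteration count is simply chosen to be large enough that HotSpot is likely to compile the loop. The launch commands are given as comments. The output format varies between JVM versions, and -XX:+PrintAssembly only works once the separate hsdis disassembler library has been installed.

```java
// Possible ways to launch this class on a HotSpot JVM:
//   java -XX:+PrintCompilation JitProbe
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining JitProbe
//   java -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation JitProbe
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly JitProbe  (needs hsdis)
public class JitProbe {
    // Tiny method: a likely candidate for inlining into the loop below.
    public static int square(int x) {
        return x * x;
    }

    // The loop bound is data.length, which gives the JIT a straightforward
    // chance to prove the index is in range and drop bounds checks.
    public static long sumOfSquares(int[] data) {
        long total = 0;
        for (int i = 0; i < data.length; i++) {
            total += square(data[i]);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 10;
        }
        long sink = 0;
        // Call the method many times so it becomes hot enough to compile.
        for (int iter = 0; iter < 20_000; iter++) {
            sink += sumOfSquares(data);
        }
        // Print the result so the work cannot be discarded as dead code.
        System.out.println(sink);
    }
}
```

With -XX:+PrintInlining, the lines mentioning `JitProbe::square` should indicate whether the call was inlined and, if not, give a short reason. That is about as close to a summary of decisions as the stock flags get.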

I'd appreciate hints on how best to deal with this sort of low-level stuff on the JVM.

It appears you don't really want to work at the low level if you can avoid it, and I believe this is a good thing. You have to take care of the high level first. Micro-optimisation is good for micro-benchmarks but rarely good for real applications, because you need to understand all the latencies of your end-to-end system, and in many cases you can do that without even looking at the code. That is, is the main delay in your database, OS, disk, or network IO?

I'm still curious if people have general tips and tools for this kind of activity.

Use a profiler, and if you suspect you need to go lower, it is quite likely you have missed something far more important.

Peter Lawrey answered Oct 14 '22 05:10