Tool for analyzing large Java heap dumps

Tags: java, profiling

I have a HotSpot JVM heap dump that I would like to analyze. The VM ran with -Xmx31g, and the heap dump file is 48 GB.

  • I won't even try jhat, as it requires about five times the heap memory (that would be 240 GB in my case) and is awfully slow.
  • Eclipse MAT crashes with an ArrayIndexOutOfBoundsException after analyzing the heap dump for several hours.

What other tools are available for that task? A suite of command line tools would be best, consisting of one program that transforms the heap dump into efficient data structures for analysis, combined with several other tools that work on the pre-structured data.

Roland Illig asked Aug 31 '11 07:08



1 Answer

Normally, I use ParseHeapDump.sh, which is included with Eclipse Memory Analyzer and described here, and I run it on one of our more beefed-up servers (download and copy over the Linux .zip distro, then unzip it there). The shell script needs fewer resources than parsing the heap from the GUI, and you can run it on your beefy server with more resources. You can allocate more memory by adding something like -vmargs -Xmx40g -XX:-UseGCOverheadLimit to the end of the last line of the script. For instance, the last line of that file might look like this after modification:

./MemoryAnalyzer -consolelog -application org.eclipse.mat.api.parse "$@" -vmargs -Xmx40g -XX:-UseGCOverheadLimit 
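If you would rather not edit the shipped script by hand, the modification can be scripted. The following is only a sketch: append_vmargs is a hypothetical helper name, and it assumes that the launch command really is the last line of the script (no trailing blank line) and that the options contain no "|" or "&" characters.

```shell
# Hypothetical helper: write a copy of a launcher script with
# "-vmargs <options>" appended to its last line.
# Assumes the launch command is the last line of the source script.
append_vmargs() {
  src=$1
  dst=$2
  shift 2
  # "$ s|$|...|" appends the text to the end of the last line only.
  # The options must not contain "|" or "&" (sed would misread them).
  sed "\$ s|\$| -vmargs $*|" "$src" > "$dst" || return 1
  chmod +x "$dst"
}
```

For example, append_vmargs ParseHeapDump.sh ParseHeapDump-40g.sh -Xmx40g -XX:-UseGCOverheadLimit leaves the original script untouched and writes the modified copy next to it.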

Run it like ./path/to/ParseHeapDump.sh ../today_heap_dump/jvm.hprof

After that succeeds, it creates a number of "index" files next to the .hprof file.
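A quick way to confirm that the parse produced something is to list the files that share the dump's name prefix. This is a rough sketch, not part of MAT: list_dump_artifacts is a hypothetical helper, and it assumes the generated files are named after the dump without its extension (for example jvm.*.index next to jvm.hprof).

```shell
# Hypothetical helper: print every file that sits next to the heap dump
# and shares its name prefix, excluding the dump itself.
# Assumption: MAT names its generated files "<prefix>.<something>".
list_dump_artifacts() {
  dump=$1
  dir=$(dirname -- "$dump")
  base=$(basename -- "$dump")
  prefix=${base%.*}              # "jvm.hprof" -> "jvm"
  for f in "$dir/$prefix".*; do
    [ -e "$f" ] || continue      # glob matched nothing
    case $f in
      "$dir/$base") continue ;;  # skip the dump file itself
    esac
    printf '%s\n' "$f"
  done
}
```

Running list_dump_artifacts ../today_heap_dump/jvm.hprof and getting no output would suggest the parse did not finish.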

After creating the indices, I generate reports from them, scp those reports to my local machine (just the reports, not the indices), and try to find the culprit from the reports alone. Here's a tutorial on creating the reports.

Example report:

./ParseHeapDump.sh ../today_heap_dump/jvm.hprof org.eclipse.mat.api:suspects 

Other report options:

org.eclipse.mat.api:overview and org.eclipse.mat.api:top_components
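To produce all three reports in one go, the invocations can be wrapped in a small loop. This is a minimal sketch: run_reports is a hypothetical helper name, and the script path and dump path are whatever you pass in.

```shell
# Hypothetical helper: run the three report IDs mentioned above,
# one after another, against the same heap dump.
# $1 = path to ParseHeapDump.sh, $2 = path to the .hprof file
run_reports() {
  script=$1
  dump=$2
  for report in suspects overview top_components; do
    # Stop at the first report that fails.
    "$script" "$dump" "org.eclipse.mat.api:$report" || return 1
  done
}
```

Usage would look like run_reports ./ParseHeapDump.sh ../today_heap_dump/jvm.hprof, ideally inside nohup or a terminal multiplexer on the server, since each report can take a while on a large dump.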

If those reports are not enough and I need to dig further (say, via OQL), I scp the indices as well as the hprof file to my local machine and then open the heap dump (with the indices in the same directory as the heap dump) in my Eclipse MAT GUI. From there, it does not need much memory to run.

EDIT: I'd just like to add two notes:

  • As far as I know, only the generation of the indices is the memory-intensive part of Eclipse MAT. Once you have the indices, most of your processing in Eclipse MAT does not need that much memory.
  • Doing this with a shell script means I can do it on a headless server (and I normally do, because those are usually the most powerful ones). And if you have a server that can generate a heap dump of that size, chances are you have another server that can process a heap dump of that size as well.
Franz See answered Sep 23 '22 13:09