 

Interpreting jemalloc data: possible off-heap leak

Tags:

java

jvm

I started looking into an ever-growing Java memory footprint about two weeks ago. I am using the following JVM options to keep the heap from growing too much and to help with debugging.

I am running on Ubuntu 16.04 with Oracle Java 8, since OpenJDK 8 did not have the debugging symbols I needed to make jemalloc produce correct data.

-XX:NativeMemoryTracking=detail -XX:+UseG1GC -XX:+UseStringDeduplication -Xms64m -Xmx256m -XX:MaxMetaspaceSize=128m -Xss256k

As you can see, -Xmx is set to 256m, yet top currently shows the process at 1.1G.

After using JProfiler, JVisualVM and many other things I could find on Google, I have come to the conclusion that this must be an off-heap problem.

After much searching I came across jemalloc, and the articles I read about it seemed promising. But I am now having trouble interpreting this data and pinpointing the source of my problem.

top memory usage

jemalloc graph

Native Memory Tracking Data

Native Memory Tracking:

Total: reserved=1678MB, committed=498MB
-                 Java Heap (reserved=256MB, committed=256MB)
                            (mmap: reserved=256MB, committed=256MB)

-                     Class (reserved=1103MB, committed=89MB)
                            (classes #14604)
                            (malloc=3MB #32346)
                            (mmap: reserved=1100MB, committed=85MB)

-                    Thread (reserved=26MB, committed=26MB)
                            (thread #53)
                            (stack: reserved=26MB, committed=26MB)

-                      Code (reserved=261MB, committed=96MB)
                            (malloc=17MB #17740)
                            (mmap: reserved=244MB, committed=79MB)

-                        GC (reserved=1MB, committed=1MB)
                            (mmap: reserved=1MB, committed=1MB)

-                  Internal (reserved=6MB, committed=6MB)
                            (malloc=6MB #48332)

-                    Symbol (reserved=19MB, committed=19MB)
                            (malloc=16MB #168491)
                            (arena=4MB #1)

-    Native Memory Tracking (reserved=5MB, committed=5MB)
                            (tracking overhead=4MB)
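For reference, an NMT summary like the one above can be dumped at runtime with jcmd (assuming jcmd from the same JDK is on the PATH and <pid> is the Java process):

jcmd <pid> VM.native_memory summary scale=MB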
asked Dec 02 '18 by Elroy


1 Answer

Check process memory map

Native Memory Tracking accounts only for the structures of the Java Virtual Machine; it does not count memory-mapped files or native memory allocated by shared libraries (including the native code of the Java Class Library). Furthermore, NMT does not track internal fragmentation of malloc, the standard libc allocator.

First, to analyze off-heap usage of a Java process, look at its full memory map:

pmap -X <pid>

This will shed light on whether the memory is used by mapped files or by anonymous regions.
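If the full -X listing is hard to eyeball, a quick way to spot the biggest consumers is to sort the simpler pmap -x output by its RSS column (only a convenience sketch; in the classic procps format the third column is RSS in KB):

# list the 20 mappings with the largest resident size
pmap -x <pid> | sort -n -k3 | tail -20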

Change standard allocator

If you see a number of 64 MB anonymous regions, this could be a sign of malloc arenas. Libc malloc is known to have issues with excessive virtual memory usage on some systems. Using jemalloc or tcmalloc as a drop-in replacement (even without the profiling feature) may be a solution in this case, as sketched below.
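A minimal sketch of both workarounds, where the library path is an assumption that depends on your distro: either cap the number of glibc malloc arenas, or preload jemalloc as a drop-in replacement (optionally with its profiler enabled):

# Option 1: limit the number of glibc malloc arenas
export MALLOC_ARENA_MAX=2

# Option 2: preload jemalloc (the .so path below is an assumption; adjust for your system)
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so
# Optional: enable jemalloc profiling (requires a jemalloc built with --enable-prof)
export MALLOC_CONF=prof:true,lg_prof_interval:30,lg_prof_sample:17

java <your flags> -jar <your-app>.jar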

Profile native allocations

Unfortunately, the jemalloc profiler knows nothing about Java; the graph breaks at the last native function, so the output may look confusing. In your case, jemalloc suggests the problem may be related to class loading and System.loadLibrary, but it's hard to tell for sure without the full picture.

Async-profiler can trace native allocations in Java context. Run

./profiler.sh -d <duration> -e malloc -f malloc.svg <pid>

This will produce a Flame Graph of malloc calls, e.g.:

Malloc Flame Graph

This is just an example demonstrating how java.util.zip.GZIPOutputStream can be a source of native memory allocations. Your case will be different, of course.

Note that malloc calls do not by themselves indicate a memory leak. For example, memory could be allocated and then released shortly afterwards. The graph is just a hint at where to look.

In order to find places where RSS increases, you may want to trace mprotect or mmap calls. This can be done with async-profiler in a similar way:

./profiler.sh -d <duration> -e mprotect -f mprotect.svg <pid>
./profiler.sh -d <duration> -e mmap -f mmap.svg <pid>

Pay attention to agent libraries

I've noticed the cbClassPrepare and classTrack_processUnloads functions in your jemalloc graph. This means you are using the jdwp debugging agent. It can definitely be a reason for excessive memory allocation - I have seen memory leaks in jdwp before. Any other agent library enabled through the -agentlib, -agentpath or -javaagent options is also a suspect, since its native memory usage is not tracked by the JVM.
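To rule this out, check which agents the process was actually started with and try running without them; the jdwp option below is just an example of what to look for (transport and port are placeholders):

# print the full startup command line, including -agentlib/-agentpath/-javaagent options
jcmd <pid> VM.command_line
# a typical jdwp option that would explain the cbClassPrepare frames:
#   -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005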

answered Oct 30 '22 by apangin