
How to really benchmark the memory usage of a Java application

I want to compare different implementations of Java programs in terms of their memory efficiency. The different usage scenarios are formulated as JUnit test cases. All the code is open source at: https://github.com/headissue/cache2k-benchmark

The usual way to obtain the used memory of a Java program is Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory(). The same values are also available through the JMX interface.
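As a minimal sketch of the two readings mentioned above, the following class obtains the used heap once via Runtime and once via the platform MemoryMXBean. The class and method names (UsedMemoryProbe, usedViaRuntime, usedViaJmx) are made up for illustration and are not part of the benchmark code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not taken from the benchmark repositories.
class UsedMemoryProbe {

  /** Used heap according to Runtime: total minus free, in bytes. */
  static long usedViaRuntime() {
    Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
  }

  /** Used heap according to the platform MemoryMXBean, in bytes. */
  static long usedViaJmx() {
    MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
    MemoryUsage heap = memory.getHeapMemoryUsage();
    return heap.getUsed();
  }

  public static void main(String[] args) {
    // The two readings are taken at slightly different moments,
    // so they will usually be close but not identical.
    System.out.println("Runtime: used=" + usedViaRuntime());
    System.out.println("JMX:     used=" + usedViaJmx());
  }
}
```

Both numbers include any garbage that has not been collected yet, which is exactly the limitation discussed next.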

However, the values determined this way are not reliable. Possible reasons:

  • There may be uncollected garbage
  • There is fragmentation if the GC did not compact the heap

So far I have experimented with switching to the serial GC and forcing garbage collection with Runtime.getRuntime().gc() before reading the value. I've put the experimental code for this at: https://github.com/cruftex/java-memory-benchmark

If I do three gc calls before reading the values, I get this output (mvn test | grep loopCount with jdk1.7.0_51):

testBaseline1: used=1084168, loopCount=0, total=124780544
testBaseline2: used=485632, loopCount=0, total=124780544
testBaseline3: used=483760, loopCount=0, total=124780544
testBaseline4: used=483800, loopCount=0, total=124780544
testBaseline: used=484160, loopCount=0, total=124780544
test100MBytes: used=105341496, loopCount=0, total=276828160
test127MBytes: used=133653088, loopCount=0, total=469901312
test27MBytes: used=28795528, loopCount=0, total=317755392
test10MBytes: used=10969776, loopCount=0, total=124784640

With four gc calls (as checked in) I get:

testBaseline1: used=483072, loopCount=0, total=124780544
testBaseline2: used=483728, loopCount=0, total=124780544
testBaseline3: used=483768, loopCount=0, total=124780544
testBaseline4: used=483808, loopCount=0, total=124780544
testBaseline: used=483848, loopCount=0, total=124780544
test100MBytes: used=105341504, loopCount=0, total=276828160
test127MBytes: used=133653096, loopCount=0, total=469901312
test27MBytes: used=28795536, loopCount=0, total=139239424
test10MBytes: used=10969784, loopCount=0, total=124784640

So empirically, the results seem to be correct with four GC calls. From the GC statistics output I can see that the first GC call fills the tenured space and the fourth GC call reduces it:

2015-01-08T02:30:35.069+0100: [Full GC2015-01-08T02:30:35.069+0100: [Tenured: 0K->1058K(83968K)
2015-01-08T02:30:35.136+0100: [Full GC2015-01-08T02:30:35.136+0100: [Tenured: 1058K->1058K(83968K)
2015-01-08T02:30:35.198+0100: [Full GC2015-01-08T02:30:35.198+0100: [Tenured: 1058K->1058K(83968K)
2015-01-08T02:30:35.263+0100: [Full GC2015-01-08T02:30:35.264+0100: [Tenured: 1058K->471K(83968K)
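The effect of each GC call can also be watched from inside the JVM instead of reading the GC log. The sketch below sums the collection counts reported by the GarbageCollectorMXBeans before and after each System.gc() call and prints the used memory after it. The class name GcCounter is an assumption for illustration, and the output depends on the JVM and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not taken from the benchmark repositories.
class GcCounter {

  /** Sum of the collection counts of all collectors; undefined counts (-1) are skipped. */
  static long totalCollections() {
    long sum = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      long count = gc.getCollectionCount();
      if (count > 0) {
        sum += count;
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    Runtime rt = Runtime.getRuntime();
    for (int i = 1; i <= 4; i++) {
      long before = totalCollections();
      System.gc(); // only a request; the JVM may ignore it
      long after = totalCollections();
      long used = rt.totalMemory() - rt.freeMemory();
      System.out.println("gc call " + i + ": collections=+" + (after - before) + ", used=" + used);
    }
  }
}
```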

The final code to get the memory usage value is:

try {
  Runtime.getRuntime().gc();
  Thread.sleep(55);
  Runtime.getRuntime().gc();
  Thread.sleep(55);
  Runtime.getRuntime().gc();
  Thread.sleep(55);
  Runtime.getRuntime().gc();
  Thread.sleep(55);
} catch (Exception ignore) { }
long _usedMem;
long _total;
long _total2;
long _count = -1;
// loop to get a stable reading, since memory may be resized between the method calls
do {
  _count++;
  _total = Runtime.getRuntime().totalMemory();
  try {
    Thread.sleep(12);
  } catch (Exception ignore) { }
  long _free = Runtime.getRuntime().freeMemory();
  _total2 = Runtime.getRuntime().totalMemory();
  _usedMem = _total - _free;
} while (_total != _total2);
System.out.println(_testName + ": used=" + _usedMem + ", loopCount=" + _count + ", total=" + _total);

I am unsure whether this approach produces reliable results all the time. So, some questions:

  • Is there some best practice to get reliable and comparable benchmark values from Java programs?
  • Any ideas how to tune (or actually detune) the GC for that use case?
  • Is there a reliable source that explains why four GC calls are needed, and is this behavior itself reliable? (BTW: Java 8 behaves the same way)
  • Is there a way to tell the JVM: "Do the best possible garbage collection, I'll wait"?
  • In general, what might be the most "future proof" and reliable solution for the problem statement?

Update:

Although some of the questions above are GC related, the actual problem is not. I would like to find out the memory usage of an application at a single point in time. Another possible solution would be to do a depth-first search over all objects and sum up their sizes.
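Short of walking the object graph, one reading that comes close to "usage at a single point in time" is the heap usage the JVM recorded right after its most recent collection, available via MemoryPoolMXBean.getCollectionUsage(). The sketch below is an assumption about how this could be used, not something taken from the benchmark code; the class name AfterGcUsage is made up, and pools that do not support this metric return null and are skipped.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not taken from the benchmark repositories.
class AfterGcUsage {

  /** Sum over all heap pools of the usage recorded after each pool's last collection, in bytes. */
  static long usedAfterLastGc() {
    long sum = 0;
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if (pool.getType() != MemoryType.HEAP) {
        continue;
      }
      MemoryUsage usage = pool.getCollectionUsage();
      if (usage != null) { // null means the pool does not support this metric
        sum += usage.getUsed();
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    System.gc(); // request a collection so that the reading is recent
    System.out.println("used after last GC=" + usedAfterLastGc());
  }
}
```

Each pool reports the value from its own last collection, so the sum is only meaningful shortly after a full collection.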

Update 2:

Meanwhile, I wrote an extensive blog article about this problem, discussing different ways to measure the actual memory usage:

https://cruftex.net/2017/03/28/The-6-Memory-Metrics-You-Should-Track-in-Your-Java-Benchmarks.html

cruftex asked Jan 08 '15 01:01


1 Answer

I also struggled with this issue and am interested to know whether there is a standard way.

The best I could do was to tell the JVM to collect as much garbage as possible by calling the following method after one run and before the next:

GcFinalization.awaitFullGc();

This method is from the Guava testlib package (guava-testlib), which can be added as a Maven dependency:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava-testlib</artifactId>
    <version>18.0</version>
</dependency>

The implementation looks like this:

public static void awaitFullGc() {
   final CountDownLatch finalizerRan = new CountDownLatch(1);
   WeakReference<Object> ref = new WeakReference<Object>(
      new Object() {
         @Override protected void finalize() { finalizerRan.countDown(); }
      });

   await(finalizerRan);
   awaitClear(ref);

   // Hope to catch some stragglers queued up behind our finalizable object
   System.runFinalization();
}

Calling this method gave me very consistent results for each run and brought the CPU user time (from ThreadMXBean) very close to the elapsed wall-clock time (from System.currentTimeMillis). My main concern in those measurements was running time, but the memory usage was also more consistent than in the version without this call between runs.
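For readers who do not want the extra dependency, the core idea can be sketched with the standard library alone: allocate a sentinel object that is only weakly reachable and keep requesting GC until the reference is cleared. This is a simplified illustration of the technique, not Guava's implementation; the names GcBarrier and awaitGc are hypothetical, and finalizers are left out entirely.

```java
import java.lang.ref.WeakReference;

// Hypothetical helper illustrating the weak-reference technique; not Guava's code.
class GcBarrier {

  /**
   * Requests GC until a freshly allocated, weakly referenced sentinel has been cleared.
   * Returns false if the timeout elapses or the thread is interrupted first.
   */
  static boolean awaitGc(long timeoutMillis) {
    WeakReference<Object> sentinel = new WeakReference<>(new Object());
    long start = System.nanoTime();
    long timeoutNanos = timeoutMillis * 1_000_000L;
    while (sentinel.get() != null) {
      if (System.nanoTime() - start >= timeoutNanos) {
        return false; // e.g. explicit GC is disabled
      }
      System.gc(); // only a request; hence the loop
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    boolean collected = awaitGc(10_000);
    Runtime rt = Runtime.getRuntime();
    System.out.println("collected=" + collected + ", used=" + (rt.totalMemory() - rt.freeMemory()));
  }
}
```

A cleared sentinel shows that at least one collection cycle processed weak references; it does not guarantee that every other unreachable object has been reclaimed.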

Wickoo answered Oct 27 '22 02:10