Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is the "stop the world" time reported with PrintGCApplicationStoppedTime really accurate?

I'm trying to estimate how long the threads in a java application are blocked by garbage-collection "stop the world" pauses by using the -XX:+PrintGCApplicationStoppedTime JVM option.

I parse the gc.log file and add the reported stop time to the GC timestamp (-XX:+PrintGCDateStamps) to get "start and stop" time stamps for the GC events.

The application threads measure the time it takes to send a HTTP request and get the response and log the time stamp when the request is sent and the duration of the request.

When I compare the GC event time stamps with the application (DATA) time stamps I get some odd results:

Type|   Start time|Start time diff|     End time|End time diff|  Duration
DATA|1330360259830|            230|1330360260139|            7|       309
DATA|1330360259849|            211|1330360260138|            8|       289
DATA|1330360259960|            100|1330360260135|           11|       175
DATA|1330360259979|             81|1330360260135|           11|       156
DATA|1330360259980|             80|1330360260135|           11|       155
DATA|1330360259989|             71|1330360260135|           11|       146
DATA|1330360260019|             41|1330360260135|           11|       116
DATA|1330360260029|             31|1330360260135|           11|       106
  GC|1330360260060|              0|1330360260146|            0|        86

The "odd" part above is that most of the "DATA" entries ended 11 ms before the GC event ended. My interpretation is that it took about 11 ms from the time the application threads were "released" from the safe point until the "stop the world" time was calculated by the GC.
Is this interpretation correct?

The code in the method SafepointSynchronize::end() in hotspot/src/share/vm/runtime/safepoint.cpp from openjdk seems to wake up suspended java threads before the stop time is calculated.
That means that the reported stop time is unreliable, the thread that calculates the stop time might become suspended between waking the java threads and calculating the time, right?
(I'm not a C++ programmer so I might have misinterpreted the code.)

The application was run with Oracle JDK 1.6.0_20 on a Solaris 10 5/08 x86 server.

The command line:

java -Xss128k -Xms1024m -Xmx1024m -XX:MaxPermSize=28m -Xloggc:logs/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintHeapAtGC -classpath [...] net.grinder.engine.process.WorkerProcessEntryPoint

like image 842
ola Avatar asked Mar 08 '12 17:03

ola


1 Answers

these flags are incorrectly named and you're correct that they were unreliable. The bug report you're after is this one and it was fixed in hs17 which first hit production in 6u21 (as per this fix list) so you're 1 version too old.

Also note that you're using the data incorrectly anyway. The stoppedtime flag is accompanied by a PrintGCApplicationConcurrentTime flag and what these two flags are printing is time spent at a safepoint, you can use PrintSafepointStatistics for information on what the safepoints are for.

This answer is on a similar subject.

like image 103
Matt Avatar answered Sep 27 '22 20:09

Matt