In Java 8 Update 45, adding these options to the java
call:
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintSafepointStatistics
-XX:PrintSafepointStatisticsCount=1
shows me statistics like these:
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
3679.229: no vm operation [ 72 1 2 ] [ 6016 0 6016 0 0 ] 1
2015-05-22T11:25:27.519+0200: Total time for which application threads were stopped: 6.0168551 seconds, Stopping threads took: 6.0164099 seconds
The problem here is the long time for Stopping threads
. In this example, it is 6 seconds which is already a problem for our application, but I have seen even larger times, in one instance (without full logging, though) amounting to almost a minute.
The VM operation (here: no vm operation
) is varying. I have also seen, e.g. RevokeBias
, G1IncCollectionPause
, or GCG_Operation
. Also, the page_trap_count
seems to be irrelevant. I have seen examples where it was 0, and others where it was 2. Consistent, though, is that the time always is reflected in the values of spin
and sync
.
I am looking for an in-depth explanation of those timing values spin
and sync
, but mostly I am interested in why this is happening and what I can do against it. I am not aware of anything 'evil' in our configuration. There are plenty of bored cores and unused memory on the machine, we are running pure Java (no JNI), and we are not aware of any excessive synchronization in our code.
The problem here is that it takes your application a long time to reach a safepoint. The Stopping threads
output denotes the time it takes between the JVM issues a safepoint request until all threads have reached a safepoint.
The sync
value shows same thing - it is the time it takes for all threads to reach the safeponit.
The spin
and block
values denote the times it takes for blocked
and spinning
(executing code) threads to reach the safepoint.
Knowing this we can conclude that the problem for you is that one thread is busy spinning and not able to reach its safepoint in several seconds.
Exactly why this occurs is hard to say. An example, as illustrated in this question and it's answer is that the JIT compiler can compile heavy loops without safepoint checks.
You could try running your JVM with the options -XX:+SafepointTimeout -XX:SafepointTimeoutDelay=500
. That will timeout the safepoint syncs after 500 ms and print information about the thread(s) that failed to reach the safepoint.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With