I was asked this question in an interview and couldn't answer it. Later I searched the internet and still didn't find an answer. Can anybody tell me how the JVM stops threads during a stop-the-world pause when collecting garbage, and how it runs them again?
A garbage collection pause, also known as a stop-the-world event, happens when a region of memory is full and the JVM requires space to continue. During a pause all operations are suspended. Because a pause affects networking, the node can appear as down to other nodes in the cluster.
Blocking during background GC - background GC does not suspend other threads during Gen2 collections. Nevertheless, Gen0 and Gen1 collections (which are an inevitable part of a full GC) still require managed threads to be suspended.
The JVM runs the GC when it detects that memory is running low. An object becomes eligible for GC when no live thread can access it. The timing of a collection cannot be guaranteed, however: a Java program can request a collection (for example with System.gc()), but there is no guarantee that the JVM will honor the request.
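The "request without guarantee" behavior can be observed from inside a program. The following is a minimal sketch, not a definitive test: `GcRequestDemo` and `totalCollections` are names made up for illustration, while `System.gc()` and `GarbageCollectorMXBean` are standard JDK APIs. It compares the collection counts reported by the JVM before and after the request; whether the count changes depends on the collector and on flags such as `-XX:+DisableExplicitGC`.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcRequestDemo {

    // Sums the collection counts reported by every collector in this JVM.
    // getCollectionCount() returns -1 when a collector does not report a
    // count, so only positive values are added.
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = bean.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc(); // a request only; the JVM is free to ignore it
        long after = totalCollections();

        System.out.println("collections before request: " + before);
        System.out.println("collections after request:  " + after);
        System.out.println(after > before
                ? "the JVM ran at least one collection"
                : "no collection was observed");
    }
}
```

On a default HotSpot configuration the request is usually honored, but the code deliberately handles both outcomes because nothing in the specification requires it.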
Stop the World Event - All minor garbage collections are "Stop the World" events. This means that all application threads are stopped until the operation completes. The Old Generation is used to store long surviving objects.
For HotSpot and OpenJDK at least, the JVM uses safepoints to stop the application's flow in each thread. The safepoint checks are either inserted into the JIT-compiled code or, for interpreted code, introduced by changing the bytecode mappings (see this post by Alexey Ragozin for more details).
See also this answer by Gil Tene on why safepointing can be an additional issue when dealing with stop-the-world pauses.
Here are more details (as I understand them, I don't claim to be an expert) on the safepointing mechanism in HotSpot/OpenJDK (see for example safepoint.cpp, line 154), based on the above resources, and probably on some articles by Cliff Click on the Azul Systems blog (which seem to have disappeared from the site).
The JVM needs to get control of the flow from the application, so it depends on the current state of the threads:
Blocked
The JVM already has control of that thread.
Running interpreted code
The interpreter goes into a mode (by swapping its dispatch table) where each bytecode evaluation will be preceded by a safepoint check.
Running native code (JNI)
JNI code runs in a safepoint and can continue running, unless it calls back into Java or calls some specific JVM methods, at which point it may be stopped to prevent leaving the safepoint (thanks Nitsan for the comments).
Running compiled code
The JVM makes a specific memory page (the Safepoint Polling page) unreadable, which makes the periodic reads of that page (inserted into the compiled code by the JIT compiler) fail and go to a JVM handler.
Running in the VM or transitioning states (in the VM also)
The flow goes through the safepoint check at some point anyway, so the VM waits for it.
Once a thread is at the safepoint, under the JVM's control, the JVM simply blocks it from leaving. When all threads have been stopped (i.e. the world is stopped), the JVM can do the garbage collection and then release all the threads, which resume execution.
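The stop, wait and release cycle described above can be imitated in plain Java. This is only an analogy under stated assumptions, not how HotSpot is implemented: the real JVM uses the polling page and the interpreter dispatch table rather than a flag and a monitor, and every name here (`SafepointSketch`, `poll`, `stopTheWorld`, `demo`) is hypothetical. Worker threads call `poll()` between units of work; the coordinator arms the poll, waits until every worker has parked, runs its action while nothing else moves, then releases them.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SafepointSketch {
    private final Object lock = new Object();
    private final int workerCount;
    private boolean stopRequested = false;       // guarded by lock
    private int parked = 0;                      // guarded by lock
    private volatile boolean pollArmed = false;  // cheap check on the fast path
    private volatile boolean shutdown = false;
    private final AtomicLong workDone = new AtomicLong();

    SafepointSketch(int workerCount) {
        this.workerCount = workerCount;
    }

    // Called by workers between units of work; stands in for the safepoint check.
    void poll() {
        if (!pollArmed) {
            return; // fast path: no stop requested
        }
        synchronized (lock) {
            if (!stopRequested) {
                return;
            }
            parked++;
            lock.notifyAll(); // tell the coordinator this thread has parked
            boolean interrupted = false;
            while (stopRequested) {
                try {
                    lock.wait(); // blocked from leaving until released
                } catch (InterruptedException e) {
                    interrupted = true;
                }
            }
            parked--;
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Stops every worker, runs the action while they are parked, then releases them.
    void stopTheWorld(Runnable action) throws InterruptedException {
        synchronized (lock) {
            stopRequested = true;
            pollArmed = true;
            try {
                while (parked < workerCount) {
                    lock.wait(); // wait for the remaining workers to reach a poll
                }
                action.run(); // the "garbage collection" happens here
            } finally {
                pollArmed = false;
                stopRequested = false;
                lock.notifyAll(); // release the world
            }
        }
    }

    Thread startWorker() {
        Thread t = new Thread(() -> {
            while (!shutdown) {
                workDone.incrementAndGet(); // one unit of "application" work
                poll();
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Runs the sketch and reports whether the work counter stayed frozen during the pause.
    public static boolean demo(int workers) {
        SafepointSketch s = new SafepointSketch(workers);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = s.startWorker();
        }
        boolean[] frozen = new boolean[1];
        try {
            s.stopTheWorld(() -> {
                long before = s.workDone.get();
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                frozen[0] = (before == s.workDone.get());
            });
            s.shutdown = true;
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException("demo was interrupted", e);
        }
        return frozen[0];
    }

    public static void main(String[] args) {
        System.out.println("world stayed stopped: " + demo(4));
    }
}
```

The sketch also shows why time-to-safepoint matters: the coordinator cannot start its action until the slowest worker reaches its next `poll()`, so a worker that polls rarely delays the pause for everyone.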
For much more detail, see this blog post on safepoints, written by Nitsan Wakart in the meantime (which itself has even more references).