I was reading a performance-related post and came across the sentence "Java's garbage collector stops all the threads before reclaiming the memory, which is also a performance issue". I searched Google for an explanation, but could not find one.
Could someone please share something so that I can be clear about this?
The short and not very informative answer is "because it's damn hard not to", so let's elaborate.
There are many built-in collectors in the HotSpot JVM (see https://blogs.oracle.com/jonthecollector/entry/our_collectors). The generational collectors have evolved significantly, but they still cannot achieve full concurrency. They can concurrently mark which objects are dead and which are not, and they can concurrently sweep the dead objects, but they still cannot concurrently compact the fragmented live objects without stopping your program (*).
This is mostly because it's really, really hard to ensure that you are not breaking someone's program by changing an object's heap location and updating all references to it, without stopping the world and doing it all in one clean sweep. It's also hard to ensure that the live objects you are moving are not being changed under your nose.
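To make the difficulty concrete, here is a minimal sketch of sliding compaction on a toy heap. It is not how HotSpot implements it: the class `ToyCompaction`, the `FREE`/`NIL` markers and the one-slot-per-object layout are all assumptions made for illustration. Each live slot holds the address of the object it references, and marking is assumed to have already happened.

```java
import java.util.Arrays;

public class ToyCompaction {
    static final int FREE = -1; // hypothetical marker: slot is empty or holds a dead object
    static final int NIL = -2;  // hypothetical marker: live object that references nothing

    // Slides live objects to the start of the heap and rewrites every reference.
    // heap[addr] is the address this object points to; roots are addresses held by the program.
    static int[] compact(int[] heap, int[] roots) {
        // Step 1: decide the new address of every live object (forwarding table).
        int[] forwarding = new int[heap.length];
        Arrays.fill(forwarding, FREE);
        int next = 0;
        for (int addr = 0; addr < heap.length; addr++) {
            if (heap[addr] != FREE) forwarding[addr] = next++;
        }

        // Step 2: move the objects. Their fields still contain OLD addresses.
        int[] moved = new int[heap.length];
        Arrays.fill(moved, FREE);
        for (int addr = 0; addr < heap.length; addr++) {
            if (heap[addr] != FREE) moved[forwarding[addr]] = heap[addr];
        }

        // Between step 2 and step 3 every reference is stale. A running
        // application thread would follow pointers to the wrong objects here.

        // Step 3: fix up every reference inside the heap and every root.
        for (int addr = 0; addr < next; addr++) {
            if (moved[addr] != NIL) moved[addr] = forwarding[moved[addr]];
        }
        for (int i = 0; i < roots.length; i++) {
            roots[i] = forwarding[roots[i]];
        }
        return moved;
    }

    public static void main(String[] args) {
        // Object at 4 -> object at 1 -> object at 3 -> nothing; slots 0 and 2 are garbage.
        int[] heap = {FREE, 3, FREE, NIL, 1};
        int[] roots = {4};
        int[] compacted = compact(heap, roots);
        System.out.println(Arrays.toString(compacted)); // prints [1, -2, 0, -1, -1]
        System.out.println(Arrays.toString(roots));     // prints [2]
    }
}
```

The window between steps 2 and 3 is the whole problem: the simplest way to keep application threads from seeing it is to pause them, and concurrent compactors have to pay for extra machinery to hide it instead.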
Generational collectors can still do much of the necessary work without stopping the world, and can run that way for significant amounts of time, but their algorithms only delay the inevitable; they do not guarantee fully concurrent GC. Notice how phrases like "mostly concurrent" (i.e. not always concurrent) are used when describing many GC algorithms.
There are also region-based collectors like G1GC (which is still generational, but splits the heap into many small regions), and they can show awesome results (to the point that G1GC became the default collector in HotSpot as of JDK 9), but they still cannot guarantee that there will be no stop-the-world pauses. Here the problem is again compaction, but for G1 specifically it is somewhat less of a problem, because it compacts incrementally, evacuating a few regions at a time instead of the whole heap at once.
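If you want to see which collectors your own JVM is running and how much time they have accumulated, the standard `java.lang.management` API exposes that. The following is a small sketch; the class name `GcReport` and the allocation loop are just illustrative. Note that `getCollectionTime()` reports approximate accumulated collection time, which for concurrent collectors is not necessarily the same thing as stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {
    // One line per collector registered in this JVM: name, collection count, accumulated time.
    static List<String> report() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()      // -1 if the collector does not report it
                    + ", time=" + gc.getCollectionTime() + " ms"); // approximate, cumulative
        }
        return lines;
    }

    public static void main(String[] args) {
        // Produce some short-lived garbage so the young generation has work to do.
        long allocated = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            allocated += junk.length;
        }
        System.out.println("allocated bytes: " + allocated);
        report().forEach(System.out::println);
    }
}
```

Running it with different collector flags (for example `-XX:+UseG1GC` or `-XX:+UseParallelGC`) should show different collector names, and the counts and times will vary from run to run.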
To say that this scratches the surface would be an overstatement: the area is gigantic, and I'd recommend going over some of the accessible material on the topic, like Gil Tene's Understanding Garbage Collection for the theory behind it or Emad Benjamin's Virtualizing and Tuning Large Scale JVMs for practical problems and solutions.
(*) This is not an entirely unsolvable problem, though. Azul's Zing JVM with its C4 garbage collector claims fully concurrent collection (there's a whitepaper about it, but you may find the details here more interesting). OpenJDK's Shenandoah project also shows very promising results. Still, as the8472 has explained, you pay some price in throughput and a significant price in complexity. The G1GC team considered going for a fully concurrent algorithm, but decided that the benefits of an STW collector outweigh that in the context of G1GC.