Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do garbage collectors freeze execution?

I was thinking about garbage collection on the way home, and I began wondering, why does the garbage collector totally freeze execution of a program? Personally I would have designed it to block any threads which try to allocate a new object, but threads which were running would be left alone. I can't imagine any situation where this would be a problem compared to how a garbage collector currently works.

like image 359
Martin Avatar asked Nov 30 '09 17:11

Martin


People also ask

What is garbage collector pause?

A garbage collection pause, also known as a stop-the-world event, happens when a region of memory is full and the JVM requires space to continue. During a pause all operations are suspended. Because a pause affects networking, the node can appear as down to other nodes in the cluster.

Why is garbage collector slow?

One of the most misunderstood parts of garbage collection is that it doesn't actually collect dead or unused objects; it collects used ones, the so-called survivors. Garbage collection is slow if most objects survive the collection process.

Can we force the garbage collector to run at any time?

Running the Garbage CollectorYou can ask the garbage collector to run at any time by calling System 's gc method: System. gc(); You might want to run the garbage collector to ensure that it runs at the best time for your program rather than when it's most convenient for the runtime system to run it.


2 Answers

I was thinking about garbage collection on the way home, and I began wondering, why does the garbage collector totally freeze execution of a program?

There is a trade-off between latency and throughput in GC design. You can either process heap-allocated blocks individually ("incremental") or you can batch them up and process them all at the same time ("stop the world"). Fully incremental collection never totally freezes a program and it has very low latency but it also has very poor throughput. Stop the world garbage collectors have the worst possible latency (freezing the program for seconds or even minutes at a time) but near-optimal throughput.

All of the major production GCs today provide a middle ground, typically with generational collection with the per-thread nursery generations collected in batches and incremental or concurrent collection of the shared old generation. Thus, only nursery collections incur pauses and nursery size is bounded so pause times are kept low, e.g. 10-100ms in .NET with the workstation GC.

For a simple GC algorithm that never pauses, see Baker's Treadmill. For more information on garbage collection I highly recommend the Memory Management Reference and the Garbage Collection Handbook.

There is a lot of misinformation in the other answers here. Jon Skeet wrote some source code and started discussing it from the point of view of garbage collection. You need to be very careful doing this because there is little correspondence between source code and what the GC sees. The compiler does instruction block rearrangements, register allocation, promotion and so on, all of which affect what is visible to the GC at run time. In particular, scope in source code is not carried through to compiled code and is typically replaced with the related concept of liveness. Jon also wrote that you must pause in order to get the global roots. That is not strictly true although it is the most efficient way to get the global roots and the resulting pause is almost always tiny (sub-millisecond) because you're just copying less than a kB of stack from each thread.

Powerlord wrote that moving collectors must block reads and, therefore, all threads that read. This is also not true. The simplest counter example is immutable data: referential transparency means you can read from any copy safely.

Kico wrote that pauses are required to determine reachability. This is also not true. See Dijkstra's research about "on-the-fly" collectors and any recent real-time GC such as Stacatto.

Jerry Coffin wrote the best answer but moving isn't the reason GCs pause. There are GCs that don't move but do pause (e.g. HLVM's) and those that do move but don't pause (e.g. Stacatto).

like image 196
J D Avatar answered Sep 29 '22 17:09

J D


Modern garbage collectors (in .NET and Java, anyway) don't actually "stop the world" - they do all kinds of clever things to collect concurrently.

However, you might want to consider a situation like this:

 object x = null;
 object y = new object();
 ...
 x = y;
 y = null;

Now, suppose the GC looks at x, then the lines below the ... run, and then the GC looks at y - it won't have seen any live objects... but the object should still be live.

Basically there needs to be a certain amount of pausing in order to get a consistent set of references. Then there's compaction, reference reassignment etc. However, it isn't nearly as bad as it used to be in terms of requiring everything to be stopped for the whole of the GC cycle. It does, however, get painful to think about :)

like image 20
Jon Skeet Avatar answered Sep 29 '22 18:09

Jon Skeet