Garbage collection involves walking through a list of allocated objects (either all objects or objects in a particular generation) and determining which are reachable.
How is this list maintained? Do runtimes for GC languages keep a giant list of all objects?
Also, from what I understand, GC involves walking the call stack to look for object references - how does the algorithm distinguish between GC-able pointers and primitive data?
When Java programs run on the JVM, objects are created on the heap, which is a portion of memory dedicated to the program. Eventually, some objects will no longer be needed. The garbage collector finds these unused objects and deletes them to free up memory.
Compacting garbage collectors typically use an algorithm like mark-sweep, but also re-arrange the objects to coalesce free-space to avoid fragmentation. This also often has the benefit of keeping the objects in memory ordered by their allocation time, which typically improves the locality of reference.
7. What happens to the thread when garbage collection kicks off? Explanation: The thread is paused when garbage collection runs which slows the application performance.
Real-time garbage collection as outlined in the paper below is able to garbage collect unused memory with a fixed, short (milliseconds) bound on the amount of time the program is interrupted at any one time, and the percentage of time the program is interrupted in aggregate by the garbage collector.
The memory management system keeps track of the size of each allocated object, just like it does in C or C++. One way this is commonly done is for the memory management system to allocate an extra size_t
before each allocation, that keeps track of the size of each objecct. The memory manager likewise has to keep track of the size of each free block, so that it can reuse blocks to allocate them.
The garbage collector works in two phases: the mark phase, and the sweep phase. In the mark phase, the garbage collector starts walks object references in order to find objects that are still reachable. The garbage collector starts at a few basic places where the object references are stored and given names (the stack, and global storage, and static storage), and then traverses references in the objects.
In the sweep phase, the garbage collector walks the heap from bottom to top, jumping from allocation to allocation based on those size_t
s, and frees anything that isn't marked.
Some languages (like Ruby) tag all of the primitives so that they can be identified separately from the object references at runtime. Other garbage collectors are ver conservative and follow primatives as through they were object references (though some checks must be performed to make sure that the garbage collector doesn't stick a mark in the middle of some other object). Still other languages use runtime type information to be more precise about whether they follow primatives.
Ruby's garbage collector sometimes called "conservative" because it doesn't check whether the space on the stack is actually being used, so it sometimes keeps dead objects alive by following ghost references on the stack. But since it always knows exactly whether the data it's looking at is a reference or a primative, I don't call it conservative here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With