How does heap compaction work quickly?

Tags:

They say that compacting garbage collectors are faster than traditional memory management because they only have to collect live objects, and by rearranging them in memory so everything's in one contiguous block, you end up with no heap fragmentation. But how can that be done quickly? It seems to me that that's equivalent to the bin-packing problem, which is NP-hard and can't be completed in a reasonable amount of time on a large dataset within the current limits of our knowledge about computation. What am I missing?

678

asked Apr 18 '10 18:04

Mason Wheeler

1 Answers

Compaction means moving objects in RAM so that some objects are removed (the dead objects, that the GC is supposed to reclaim) and all remaining objects become contiguous in RAM.

Most compacting GC work by allocating objects within a single, contiguous area obtained from the operating system. Compaction is then like removing the dead objects, and then pushing all remaining live objects "to the left", squeezing out the holes. If the GC works by compaction, then allocation is simply a matter of moving up the "end of allocated area" pointer. Synthetically, within the allocation area, there is a pointer such that the free area consists of the bytes after that pointer. To allocate space for an object, the pointer is simply moved up by the new object size. Occasionally, the GC decides that it is time to run, detects the dead object, squeezes the holes out, and thus lowers the allocation pointer.

The performance gains from a compacting GC come from several sources:

For allocation, there is no need to find a suitable "hole": by construction, free space is, at all times, a single, big area, and it suffices to just move a pointer up.
There is no fragmentation: the compaction moves all live objects together, but this can be seen as moving all holes together into a single big free space.
Locality is improved. By moving live objects into adjacent areas, behaviour with regards to cache memory is improved. In particular, compaction algorithms tend to keep objects in their respective allocation order (objects are slided but not swapped) which has been reported to be heuristically good for most applications.

If the operating system refuses to give a single allocation area, instead yielding several blocks, then things become a bit more complex, and could begin to look like the bin-packing problem, because the compacting GC then has to decide in which block goes each live object. However, the complexity of bin-packing is about finding the "perfect" match in the general case; an approximate solution is already good enough for a memory allocator.

The algorithmic difficulty in a compaction algorithm is about updating all pointers, so that they point to the new object location. Through strict typing, the .NET virtual machine can unambiguously decide whether each word in RAM is a pointer or not, but updating all pointers efficiently without using too much extra RAM can be tricky. H.B.M. Jonkers has described a wonderfully smart algorithm for that, in "A Fast Garbage Compaction Algorithm" (Information Processing Letters, Volume 9, Number 1, 1979, pp. 26-30). I could not find a copy of that paper on the Vast Internet, but the algorithm is described in the "Garbage Collection" book by Jones and Lins (section 5.6). I warmly recommend that book to anyone who is interested in understanding garbage collectors. Jonkers' algorithm requires two linear passes on the live objects and turns out to be easy to implement (a few dozen lines of code, no more; the most difficult part is understanding why it works).

Additional complexity comes from generational collectors which try, most of the time, to leave most of the objects untouched, working preferentially with young objects only. Here, this means compacting only the end of the heap; full compaction being applied only rarely. The point here is that a full compaction, although linear, could still induce a noticeable pause. Generational GC tries to make such pauses rarer. There again, Jones' and Lins' book is a must-read.

174

answered Oct 10 '22 14:10

Thomas Pornin

Related questions
                            
                                C# implementation of Google's 'Encoded Polyline Algorithm'
                            
                                Finding if a Binary Tree is a Binary Search Tree [duplicate]
                            
                                Building bridges problem - how to apply longest increasing subsequence?
                            
                                How can I generate truly (not pseudo) random numbers with C#?
                            
                                What is the fastest way to find Nth biggest number of an INT array?
                            
                                Algorithm to find k smallest numbers in array of n items
                            
                                All factors of a given number
                            
                                Changing integer to binary string of digits
                            
                                How to calculate a standard deviation [array] [duplicate]
                            
                                Given a list of numbers and a number k, return whether any two numbers from the list add up to k
                            
                                Help me understand Inorder Traversal without using recursion
                            
                                How to change a negative number to zero in python without using decision structures
                            
                                Boyer Moore Algorithm Understanding and Example?
                            
                                How to efficiently rack up billiards for an 8-ball game?
                            
                                Is there an efficient algorithm for segmentation of handwritten text?
                            
                                What is a good open source B-tree implementation in C? [closed]
                            
                                How to optimally solve the flood fill puzzle?
                            
                                What's the fastest way to brush up on algorithms for a technical interview (on Monday)? [closed]
                            
                                Fast way of getting the dominant color of an image [closed]
                            
                                Real world applications of Binary heaps and Fibonacci Heaps [closed]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How does heap compaction work quickly?

Tags:

algorithm

garbage-collection

Mason Wheeler

People also ask

1 Answers

Thomas Pornin

Recent Activity

Donate For Us