
Summary of the last decade of garbage collection? [closed]

I've been reading through the Jones & Lins book on garbage collection, which was published in 1996.

Obviously, the computing world has changed dramatically since then: multicore, out-of-order chips with large caches, and even larger main memory in desktops. The world has also more-or-less settled on the x86 and ARM architectures for most consumer-facing systems.

What have the most significant advances been since the seminal book was published?

I'm looking in particular for pointers to papers, algorithms, dissertations, and the like, representing advances in both the theory & practice of garbage collection.

Ben Karel asked May 26 '10 17:05


2 Answers

GC Advancements on the JVM:

The JVM's G1 collector seems to bring some new improvements to the table (for the JVM, at least).

G1 is a “server-style” GC and has the following attributes.

Parallelism and Concurrency. G1 takes advantage of the parallelism that exists in hardware today. It uses all available CPUs (cores, hardware threads, etc.) to speed up its “stop-the-world” pauses when an application's Java threads are stopped to enable GC. It also works concurrently with running Java threads to minimize whole-heap operations during stop-the-world pauses.

Generational. Like the other HotSpot GCs, G1 is generational, meaning it treats newly-allocated (aka young) objects and objects that have lived for some time (aka old) differently. It concentrates garbage collection activity on young objects, as they are the ones most likely to be reclaimable, while visiting old objects infrequently. For most Java applications, generational garbage collection has major efficiency advantages over alternative schemes.

Compaction. Unlike CMS, G1 performs heap compaction over time. Compaction eliminates potential fragmentation problems to ensure smooth and consistent long-running operation.

Predictability. G1 is expected to be more predictable than CMS. This is largely due to the elimination of fragmentation issues that can negatively affect stop-the-world pause times in CMS. Additionally, G1 has a pause prediction model that, in many situations, allows it to meet (or only rarely exceed) a pause time target.

G1 Link
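To make the "Generational" point concrete, here is a toy model in Java. It is only a sketch and not how G1 or any HotSpot collector is implemented: the class names, the `live` flag (a stand-in for reachability, which a real collector finds by tracing from roots) and the `PROMOTION_AGE` threshold are all made up for illustration. It shows a minor collection examining only young objects and promoting survivors.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ToyObj {
    boolean live = true; // stand-in for "reachable"; a real collector traces from roots
    int age = 0;         // number of minor collections survived so far
}

class ToyGenHeap {
    static final int PROMOTION_AGE = 2; // hypothetical tenuring threshold
    final List<ToyObj> young = new ArrayList<>();
    final List<ToyObj> old = new ArrayList<>();
    long scanned = 0; // total objects examined by all collections so far

    /** New objects always start in the young generation. */
    ToyObj allocate() {
        ToyObj o = new ToyObj();
        young.add(o);
        return o;
    }

    /** Minor collection: examines only the young generation. */
    void minorCollect() {
        Iterator<ToyObj> it = young.iterator();
        while (it.hasNext()) {
            ToyObj o = it.next();
            scanned++;
            if (!o.live) {
                it.remove();   // dead young object: reclaimed
            } else if (++o.age >= PROMOTION_AGE) {
                it.remove();   // survived long enough: promote
                old.add(o);
            }
        }
    }

    /** Major collection: also examines the old generation (done rarely). */
    void majorCollect() {
        minorCollect();
        scanned += old.size();
        old.removeIf(o -> !o.live);
    }
}
```

In this model the cost of a minor collection depends on the size of the young generation only, which is the efficiency argument the quoted description makes.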

HotSpot 6 also seems to offer several garbage collectors to choose from.
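If you want to check which collectors a given JVM is actually running with, the standard `java.lang.management` API can report them. The sketch below is a minimal example; the class name `GcInspector` is made up here, and the names it prints vary by JVM vendor, version and command-line flags (on HotSpot, flags such as `-XX:+UseSerialGC`, `-XX:+UseParallelGC` or `-XX:+UseG1GC` select a collector where the build supports it).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcInspector {
    /** Returns the names of the collectors the running JVM reports. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both counters return -1 if the value is undefined for this collector.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running it once per flag, for example `java -XX:+UseG1GC GcInspector`, is a quick way to see how the choice of collector changes what the JVM reports.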

bakkal answered Dec 03 '22 21:12


As far as I know, most advances on garbage collection techniques in the last decade were on the "practical" side: algorithms were known, but some considerable tuning was performed with regard to multi-core systems and observed usage patterns. A substantial part of that research was done by Sun and IBM, in the context of Java (it is striking that most of the usage pattern analyses presented in the Jones & Lins book are about Lisp and its singly-linked lists; nowadays papers talk about Java). The G1 algorithm from Sun is built upon older ideas which are all in the Jones & Lins book -- but the people at Sun (now Oracle) worked hard to find out which combination was most efficient.

There was also much research on distributed garbage collection -- how to GC-manage data objects which are scattered over distinct systems, within the usual challenging conditions of distributed computing: the network is slow, nodes may not be equivalent to each other, and some nodes may fail. The overall conclusion seems to be that it does not work (there was much more research than findings). Limited versions with reference counting (for references to objects located on another system) have been implemented (e.g. in Java's RMI) and appear to work in contexts where there is no cycle of references across nodes.
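The restriction to acyclic references is easy to see in a toy model. The sketch below is plain single-process reference counting in Java, not RMI's actual distributed GC protocol; `RcObject`, `RcDemo` and the "node" labels are hypothetical names used only to stand in for objects living on different machines.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reference-counted object; the name suffix stands in for its host node. */
class RcObject {
    final String name;
    int refCount = 0;
    boolean reclaimed = false;
    final List<RcObject> outgoing = new ArrayList<>();

    RcObject(String name) { this.name = name; }

    /** An external holder (e.g. a client) takes a reference to this object. */
    void retain() { refCount++; }

    /** Record a reference from this object to target. */
    void addRef(RcObject target) {
        outgoing.add(target);
        target.refCount++;
    }

    /** Drop one reference; reclaim the object once the count reaches zero. */
    void release() {
        refCount--;
        if (refCount == 0 && !reclaimed) {
            reclaimed = true;
            // Reclaiming an object releases everything it pointed to.
            for (RcObject t : outgoing) t.release();
            outgoing.clear();
        }
    }
}

class RcDemo {
    /** Chain a -> b across two nodes: dropping the only holder frees both. */
    static boolean acyclicIsReclaimed() {
        RcObject a = new RcObject("a@node1");
        RcObject b = new RcObject("b@node2");
        a.retain();
        a.addRef(b);
        a.release();
        return a.reclaimed && b.reclaimed;
    }

    /** Cycle a -> b -> a across two nodes: the counts never reach zero. */
    static boolean cycleIsReclaimed() {
        RcObject a = new RcObject("a@node1");
        RcObject b = new RcObject("b@node2");
        a.retain();
        a.addRef(b);
        b.addRef(a);
        a.release(); // a still has count 1 because b points to it
        return a.reclaimed || b.reclaimed;
    }

    public static void main(String[] args) {
        System.out.println("acyclic reclaimed: " + acyclicIsReclaimed()); // true
        System.out.println("cycle reclaimed: " + cycleIsReclaimed());     // false
    }
}
```

In the cyclic case each object keeps the other's count above zero, so neither is ever reclaimed even though no outside holder remains; breaking such cycles across machines is the part that needs global knowledge.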

Thomas Pornin answered Dec 03 '22 19:12