Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does the Garbage-First Garbage Collector work?

Can someone explain how the G1 Garbage Collector works please? I haven't been able to find any comprehensive, easy-to-understand descriptions anywhere yet.

Thanks

like image 296
dogbane Avatar asked May 21 '10 12:05

dogbane


People also ask

How does G1 garbage collector work?

G1 copies objects from one or more regions of the heap to a single region on the heap, and in the process both compacts and frees up memory. This evacuation is performed in parallel on multi-processors, to decrease pause times and increase throughput.

How does the garbage collector works?

In the common language runtime (CLR), the garbage collector (GC) serves as an automatic memory manager. The garbage collector manages the allocation and release of memory for an application. For developers working with managed code, this means that you don't have to write code to perform memory management tasks.

Why do we call garbage-first?

After the marking phase completes, G1 knows which regions are mostly empty. It collects these regions first, which often yields a large amount of free space. This is why this method of garbage collection is called Garbage-First.

What is the primary role of the garbage collector?

Garbage collection (GC) is a dynamic approach to automatic memory management and heap allocation that processes and identifies dead memory blocks and reallocates storage for reuse. The primary purpose of garbage collection is to reduce memory leaks.


2 Answers

The collector splits the heap up into fixed-size regions and tracks the live data in those regions. It keeps a set of pointers — the "remembered set" — into and out of the region. When a GC is deemed necessary, it collects the regions with less live data first (hence, "garbage first"). Often, this can mean collecting an entire region in one step: if the number of pointers into a region is zero, then it doesn't need to do a mark or sweep of that region.

For each region, it tracks various metrics that describe how long it will take to collect them. You can give it a soft real-time constraint about pause times, and it then tries to collect as much garbage as it can in that constrained time.

There is JavaOne talk about G1 and few articles on the topic:

  • http://developers.sun.com/learning/javaoneonline/j1sessn.jsp
  • http://www.fasterj.com/articles/G1.shtml
  • http://www.drdobbs.com/java/219401061
  • http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All
like image 122
Eugene Kuleshov Avatar answered Sep 19 '22 06:09

Eugene Kuleshov


The G1 is nicely explained also in this new JavaOne 2012 session : G1 Garbage Collector Performance Tuning [youtube], [PDF].

They start with introduction of CMS and G1, their comparison, and then the G1 analysis and tuning is explained.

G1 characteristics

  • Fixed Size Regions - heap split into regions (1Mb - 32MB, ~2000, determined by the VM).
  • Eden, survivor and OldGen represented as a logical set of regions.
  • Live objects are evacuated from one region to another

A typical G1 heap may look like:

A typical G1 heap may look like:

Here is a summary of each G1 phase:

1. Young Collection

1.1 Young Phase - Minor GC

  • Evacuation - Stop-The-World Parallel minor GC where live objects are evacuated from Young Generation either into Survivor regions (tenuring) or OldGen regions (promotion).
  • Accounting - size of eden/survivor space for the next young GC is determined based on stats from each region and based on the pause-time target set by the app. G1 estimates how much time it will need for the next YoungGC.
  • Resizing - G1 can now easily reduce/resize eden/survivor regions.

1.2 Young / Initial Mark

  • GC young initial-mark is an initial marking phase for the OldGen collection that is executed in parallel with the YoungGC collection. The initial mark is a parallel concurrent marking process.

2. Old Gen Collection

2.1 Initial Mark - see 1.2.

2.2 GC remark

  • one stop-the-world pause, concurrenlty marking live objects
  • accounting - for each region, during remark, G1 is tracking liviness of the region(how many objects are live in each region), and references into the region (Remembered Set) - this tells G1 how long is takes to do a collection on this region.
  • Empty Regions reclaimed

2.3. GC pause (mixed)

  • choose regions with low liviness and collect some of them. Hence, we are collecting the "garbage first".
  • the actual collection of these regions is performed at the same time as the next Young GC, so there is no separate pause for the collection of the OldGen. Therefore, the GC pause (mixed) is a mixed collection of YoungGen and a portion of old Gen.
  • At the end of GC pause (mixed) there can be some garbage left in old gen regions, this will be collected later depending of future liviness, pause time target and number of unused regions.

3. Full GC

Note that G1 is intended to avoid Full GC as much as possible. As of Java 7u40, FullGC pauses in G1 are not optimized and are implemented as a single threaded operation. When using G1, try to avoid Full GC - if you see any FullGC pauses, your GC setup probably requires some tuning.

Resources

  • G1: One Garbage Collector To Rule Them All
  • JavaOne 2012 session : G1 Garbage Collector Performance Tuning [youtube], [PDF]
  • Garbage Collection in Java (4) - Garbage First
  • Garbage-First Garbage Collection - the original paper.
like image 26
Aleš Avatar answered Sep 18 '22 06:09

Aleš