
Unnecessary Full GC with the G1 garbage collector in Java 8?

We noticed occasional full GCs with the G1 garbage collector, preceded by concurrent-mark overflow. Once a concurrent-mark-reset-for-overflow occurs, the overflow recurs in the following concurrent mark phases. Eventually this leads to a full GC, since concurrent marking no longer seems to complete.

We have four machines running the same Apache Storm based application with the same data traffic. Only one of the machines shows this behavior, about once a week.

Is this related to the bug ‘G1 does not expand marking stack when mark stack overflow happens during concurrent marking’? https://bugs.openjdk.java.net/browse/JDK-8065402

Following the suggestion on that page, we doubled the concurrent mark threads from 4 to 8 and the heap size from 8 GB to 16 GB. However, the full GC still happens; the only difference is that it occurs later.

Any other suggestions?

Here's the GC log:

Java HotSpot(TM) 64-Bit Server VM (25.65-b01) for linux-amd64 JRE(1.8.0_65b17), 
built on Oct  6 2015 17:16:12 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8) 
Memory: 4k page, physical 529167668k(69283408k free), swap 33554424k(33552380k free) 
CommandLine flags: -XX:ConcGCThreads=8 -XX:G1ReservePercent=20 -XX:GCLogFileSize=104857600 
-XX:InitialHeapSize=17179869184 -XX:InitiatingHeapOccupancyPercent=45 -XX:MaxGCPauseMillis=100 
-XX:MaxHeapSize=17179869184 -XX:NumberOfGCLogFiles=10 -XX:ParallelGCThreads=30 
-XX:+PrintAdaptiveSizePolicy -XX:PrintFLSStatistics=2 -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime 
-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC 
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation
...
...
2016-04-13T22:06:37.254-0400: 19839.175: [GC concurrent-root-region-scan-start]
2016-04-13T22:06:37.313-0400: 19839.234: [GC concurrent-root-region-scan-end, 0.0592966 secs]
2016-04-13T22:06:37.313-0400: 19839.234: [GC concurrent-mark-start]
2016-04-13T22:06:38.569-0400: 19840.490: [GC concurrent-mark-reset-for-overflow]
...
2016-04-13T22:06:42.810-0400: 19844.731: [GC concurrent-mark-reset-for-overflow]
...
2016-04-13T22:11:19.253-0400: 20121.175: [GC concurrent-mark-reset-for-overflow]
...
...
...
2016-04-14T01:58:17.254-0400: 33739.176: [GC concurrent-mark-reset-for-overflow]
...
2016-04-14T01:58:36.957-0400: 33758.878: [Full GC (Allocation Failure)
Jeff asked Apr 15 '16 14:04




1 Answer

From the Oracle G1 GC blog:

GC concurrent-mark-reset-for-overflow: This indicates that the global marking stack had become full and there was an overflow of the stack. Concurrent marking detected this overflow and had to reset the data structures to start the marking again.

So increasing -XX:MarkStackSize is one quick win.
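Before raising -XX:MarkStackSize it helps to know the value the running JVM actually uses. Below is a minimal sketch, assuming a HotSpot JVM that exposes HotSpotDiagnosticMXBean. The class name MarkStackFlags and the helper flagValue are hypothetical, and MarkStackSizeMax is, to my knowledge, the companion upper-bound flag in JDK 8, so treat that name as an assumption:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

// Hypothetical helper class: prints the mark stack flags of the current JVM.
final class MarkStackFlags {

    // Returns the current value of a HotSpot flag, or empty if this JVM
    // does not expose the diagnostic bean or does not know the flag.
    static Optional<String> flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (IllegalArgumentException e) {
            // Thrown when the flag (or the bean) does not exist in this JVM.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // MarkStackSizeMax is an assumed companion flag; it may be absent.
        for (String name : new String[] {"MarkStackSize", "MarkStackSizeMax"}) {
            System.out.println(name + " = " + flagValue(name).orElse("(not available)"));
        }
    }
}
```

Run it with the same JVM and flags as the application, so that the printed values reflect the configuration that produced the overflow.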

A few observations on your VM parameters:

  1. The G1 GC is an adaptive garbage collector with defaults that enable it to work efficiently without modification. Have a quick look at the Oracle documentation page on G1GC.
  2. Key parameters to set: -XX:MaxGCPauseMillis, -XX:G1HeapRegionSize, -XX:ParallelGCThreads=n, -XX:ConcGCThreads=n. Leave everything else at the default values.
  3. If your heap size is 16 GB, the ideal region size is 8 MB, which keeps the region count at about 2048 (16 GB / 2048 = 8 MB).
  4. Revisit your pause time goal, -XX:MaxGCPauseMillis. Your flags set it to 100 ms (the default is 200 ms); if that is unrealistic for a 16 GB heap, relax it.
  5. The official documentation page recommends how to set -XX:ParallelGCThreads=n and -XX:ConcGCThreads=n depending on the number of cores in your machine:

    -XX:ParallelGCThreads=n: Sets the value of the STW worker threads. Sets the value of n to the number of logical processors. The value of n is the same as the number of logical processors up to a value of 8.

    -XX:ConcGCThreads=n: Sets the number of parallel marking threads. Sets n to approximately 1/4 of the number of parallel garbage collection threads (ParallelGCThreads).

  6. Revisit the -XX:InitialHeapSize=17179869184, -XX:InitiatingHeapOccupancyPercent=45 and -XX:G1ReservePercent=20 parameters. Leave them at their default values unless you have a pressing need to change them.
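The sizing rules in the list above can be sketched in a few lines. This is an approximation under stated assumptions, not HotSpot's actual ergonomics code: the class G1SizingSketch and its methods are hypothetical, the region size is taken as heap / 2048 rounded down to a power of two and clamped to 1 MB..32 MB, the thread count above eight processors uses 8 + 5/8 of the remaining processors, and ConcGCThreads uses (ParallelGCThreads + 2) / 4 as one reading of "approximately 1/4".

```java
// Hypothetical sketch of the G1 sizing rules of thumb quoted above.
final class G1SizingSketch {
    static final long MB = 1024L * 1024L;

    // Assumed rule: heap / 2048, rounded down to a power of two,
    // clamped to the 1 MB..32 MB range.
    static long regionSizeBytes(long heapBytes) {
        long target = Math.max(heapBytes / 2048, 1);
        long powerOfTwo = Long.highestOneBit(target);
        return Math.min(Math.max(powerOfTwo, MB), 32 * MB);
    }

    // Assumed rule: n threads up to 8 processors, then 5/8 of each extra one.
    static int parallelGcThreads(int logicalProcessors) {
        if (logicalProcessors <= 8) {
            return Math.max(logicalProcessors, 1);
        }
        return 8 + (logicalProcessors - 8) * 5 / 8;
    }

    // Assumed rule: about a quarter of the parallel threads, at least one.
    static int concGcThreads(int parallelGcThreads) {
        return Math.max((parallelGcThreads + 2) / 4, 1);
    }

    public static void main(String[] args) {
        long heap = 17179869184L; // -XX:MaxHeapSize from the question
        System.out.println("region size (MB): " + regionSizeBytes(heap) / MB);
        System.out.println("ConcGCThreads for ParallelGCThreads=30: " + concGcThreads(30));
    }
}
```

For the 16 GB heap in the question this gives an 8 MB region size, and for ParallelGCThreads=30 it gives 8 concurrent threads, which is the value already in use.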

Visit this page for a better understanding of G1GC logs.
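To check whether overflow resets really precede each full GC across all four machines, the log can also be scanned mechanically. Below is a minimal sketch that counts the reset markers seen before the first Full GC line. It assumes the JDK 8 log format shown in the question; the class OverflowScan and its method are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical scanner for JDK 8 style G1 logs.
final class OverflowScan {

    // Returns how many reset-for-overflow markers appear before the first
    // Full GC line, or -1 if the excerpt contains no Full GC at all.
    static int overflowResetsBeforeFirstFullGc(List<String> lines) {
        int resets = 0;
        for (String line : lines) {
            if (line.contains("[GC concurrent-mark-reset-for-overflow]")) {
                resets++;
            } else if (line.contains("[Full GC")) {
                return resets;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java OverflowScan <gc-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(overflowResetsBeforeFirstFullGc(lines));
    }
}
```

A result of 0 on a machine that still hits full GCs would suggest the overflow is not the cause there.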

Ravindra babu answered Oct 08 '22 16:10
