
GC pauses get really long after several days

I am running a build system. We used to use the CMS collector, but we started suffering from very long full GC cycles; throughput (time not spent doing GC) was around 90%. So I decided to switch to G1, on the assumption that even if overall GC time is longer, the pauses will be shorter, ensuring higher availability. This worked even better than I expected: I saw no full GC for almost 3 days, throughput was 97%, and overall GC performance was much better. (All screenshots and data are from GCViewer.)

[Screenshot: Normal]
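For reference, the throughput figures quoted above (time not spent in GC) can be approximated from inside a JVM. This is only a minimal sketch using the standard `java.lang.management` beans; the class and method names (`GcThroughput`, `throughputPercent`, `totalGcMillis`) are made up for illustration, and the numbers it reports are approximate accumulated collection times, so they will not necessarily match what GCViewer derives from the GC log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the build system in question.
final class GcThroughput {

    /** Percentage of elapsed time that was NOT spent in garbage collection. */
    static double throughputPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            throw new IllegalArgumentException("uptime must be positive");
        }
        return 100.0 * (uptimeMillis - gcMillis) / uptimeMillis;
    }

    /** Sums the approximate accumulated collection time of all collectors in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Guard against a zero uptime right after JVM start.
        long uptime = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        System.out.printf("GC throughput: %.2f%%%n",
                throughputPercent(totalGcMillis(), uptime));
    }
}
```

With these definitions, 100 ms of GC in 1000 ms of uptime corresponds to the 90% figure, and 30 ms in 1000 ms to 97%.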

Until now (day 6). Today the system simply went berserk. Old space utilization is just barely under 100%, and I am seeing a full GC triggered every 2-3 minutes or so: [Screenshot: Berserk!]

Old space utilization: [Screenshot: Old space]

Heap size is 20G (128G RAM total). The flags I am currently using are:

-XX:+UseG1GC
-XX:MaxPermSize=512m
-XX:MaxGCPauseMillis=800
-XX:GCPauseIntervalMillis=8000 
-XX:NewRatio=4
-XX:PermSize=256m
-XX:InitiatingHeapOccupancyPercent=35
-XX:+ParallelRefProcEnabled

plus logging flags. What I seem to be missing is -XX:ParallelGCThreads=20 (I have 32 processors); the default should be 8. I have also read from Oracle that -XX:G1NewSizePercent=4 is suggested for a 20G heap; the default should be 5.
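Rather than assuming what the defaults are, the effective values can be read from a running HotSpot JVM. The sketch below uses `HotSpotDiagnosticMXBean.getVMOption`, which is HotSpot-specific; the class name `ShowGcFlags` and the helper `vmOption` are only illustrative. It reports the values of the JVM it runs in, so it would have to be started with the same flags as the build system to be meaningful.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; works only on HotSpot-based JVMs.
final class ShowGcFlags {

    /** Returns the current value of a HotSpot VM option as a string. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Throws IllegalArgumentException if the option does not exist.
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        String[] names = {"ParallelGCThreads", "MaxHeapSize", "UseG1GC"};
        for (String name : names) {
            System.out.println(name + " = " + vmOption(name));
        }
    }
}
```

From the command line, `java -XX:+PrintFlagsFinal -version` (combined with the flags of interest) prints the same kind of information for all options.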

I am using Java HotSpot(TM) 64-Bit Server VM 1.7.0_76, Oracle Corporation

What would you suggest? Do I have obvious mistakes? What should I change? Am I too greedy by giving Java only 20G? The assumption here is that giving it too much heap would mean longer GC, as there is simply more to clean (peasant logic).

PS: The application is not mine. For me it's a boxed product.

Erki M. asked Mar 06 '15 14:03

1 Answer

What would you suggest? Do I have obvious mistakes? What should I change? Am I too greedy by giving Java only 20G? The assumption here is that giving it too much heap would mean longer GC, as there is simply more to clean (peasant logic).

If it triggers full GCs but your occupancy stays near those 20GB, then it's possible that the GC simply does not have enough breathing room, either to meet the demand of huge allocations or to meet some of its goals (throughput, pause times), forcing full GCs as a fallback.

So what you can attempt is increasing the heap limit or relaxing the throughput goals.
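One rough way to judge the breathing room is to look at how full each heap pool still is right after a collection. The following is a sketch under assumptions, not a diagnosis tool: `HeapHeadroom` and `occupancyPercent` are illustrative names, and it relies on `MemoryPoolMXBean.getCollectionUsage()`, which may be unavailable for some pools. If the old generation stays close to 100% even after a collection, the live data barely fits and a larger heap limit is worth trying.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper for inspecting post-collection heap occupancy.
final class HeapHeadroom {

    /** Occupancy of a pool as a percentage of its maximum size. */
    static double occupancyPercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("max must be positive");
        }
        return 100.0 * usedBytes / maxBytes;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            // Usage measured after the most recent collection of this pool.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc == null || afterGc.getMax() <= 0) {
                continue; // not supported, or no defined maximum
            }
            System.out.printf("%s: %.1f%% used after last GC%n",
                    pool.getName(),
                    occupancyPercent(afterGc.getUsed(), afterGc.getMax()));
        }
    }
}
```

The same information is visible in GC logs as the heap occupancy printed after each collection.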

As mentioned earlier in my comment, you can also try upgrading to Java 8 for improved G1 heuristics.

For further advice, GC logs covering the "berserk" behavior would be useful.

the8472 answered Sep 27 '22 19:09