I am running a build system. We used to use the CMS collector, but we started suffering from very long full GC cycles, and throughput (time not spent in GC) was around 90%. So I decided to switch to G1, on the assumption that even if overall GC time were longer, the pauses would be shorter, ensuring higher availability. This worked even better than I expected: I saw no full GC for almost 3 days, throughput was 97%, and overall GC performance was far better. (All screenshots and data are from GCViewer.)
Until now (day 6). Today the system simply went berserk. Old space utilization is just barely under 100%, and I am seeing a full GC triggered every 2-3 minutes or so:
Old space utilization:
Heap size is 20G (128G RAM total). The flags I am currently using are:
-XX:+UseG1GC
-XX:MaxPermSize=512m
-XX:MaxGCPauseMillis=800
-XX:GCPauseIntervalMillis=8000
-XX:NewRatio=4
-XX:PermSize=256m
-XX:InitiatingHeapOccupancyPercent=35
-XX:+ParallelRefProcEnabled
plus logging flags. What I seem to be missing is -XX:ParallelGCThreads=20 (I have 32 processors); the default should be 8. I have also read in Oracle's documentation that -XX:G1NewSizePercent=4 is suggested for a 20G heap; the default should be 5.
I am using Java HotSpot(TM) 64-Bit Server VM 1.7.0_76, Oracle Corporation
What would you suggest? Do I have any obvious mistakes? What should I change? Am I too greedy by giving Java only 20G? My assumption is that giving it too much heap would mean longer GCs, as there is simply more to clean (peasant logic).
PS: The application is not mine. For me it's a box product.
Garbage collection (GC) is the process by which Java removes data that is no longer needed from memory. A garbage collection pause, also known as a stop-the-world event, happens when a region of memory is full and the JVM needs space to continue. During a pause all application threads are suspended.
If your application's object creation rate is very high, the garbage collector has to run just as often to keep up with it, and more frequent collections add up to more total pause time. Optimizing the application to create fewer objects is therefore an effective strategy for reducing GC pauses.
The pause time of the Metronome garbage collector (IBM's real-time collector, not part of HotSpot) can be fine-tuned for each Java™ process. By default, the Metronome GC pauses for 3 milliseconds in each individual pause, which is known as a quantum.
Large heap size: a large heap (-Xmx) can also cause long GC pauses. With a large heap, more garbage can accumulate before a collection runs. When a full GC is then triggered to reclaim all the accumulated garbage, it can take a long time to complete.
What would you suggest? Do I have any obvious mistakes? What should I change? Am I too greedy by giving Java only 20G? My assumption is that giving it too much heap would mean longer GCs, as there is simply more to clean (peasant logic).
If it triggers full GCs but occupancy stays near those 20GB, then it's possible that the GC simply does not have enough breathing room, either to meet the demand of huge allocations or to meet some of its goals (throughput, pause times), forcing full GCs as a fallback.
So what you can try is increasing the heap limit or relaxing the throughput goals.
As mentioned earlier in my comment, you can also try upgrading to Java 8 for improved G1 heuristics.
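Before changing anything, it may be worth confirming which values the JVM is actually running with, since settings such as the parallel GC thread count are chosen at startup when not given explicitly. A minimal sketch, assuming a HotSpot JVM (Java 7 or later) where com.sun.management.HotSpotDiagnosticMXBean is available; the class name EffectiveGcFlags and the selection of flags are only illustrative:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports the effective value and origin of a few GC flags.
final class EffectiveGcFlags {

    // Flag names taken from the question; any HotSpot flag name can be used.
    static final String[] FLAGS = {
        "UseG1GC", "MaxHeapSize", "MaxGCPauseMillis",
        "InitiatingHeapOccupancyPercent", "ParallelGCThreads"
    };

    // Returns flag name -> "value (origin)", skipping flags this JVM does not know.
    static Map<String, String> read(String... names) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (String name : names) {
            try {
                VMOption option = bean.getVMOption(name);
                result.put(name, option.getValue() + " (" + option.getOrigin() + ")");
            } catch (IllegalArgumentException unknownFlag) {
                // getVMOption throws if the flag does not exist in this JVM version.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : read(FLAGS).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Run it with the same flags as the application. An origin of VM_CREATION means the value came from the command line, while DEFAULT or ERGONOMIC means the JVM picked it itself. For a box product the same bean should also be reachable over a JMX connection, which is not shown here.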
For further advice, GC logs covering the period of constant full GCs would be useful.
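Alongside the logs, a rough picture can also be taken from the JVM's own counters. The following is a minimal sketch using the standard java.lang.management beans; it reports the same throughput measure as in the question (share of uptime not spent in GC) plus per-collector counts. The class name GcSnapshot is made up for illustration, and the numbers only cover the JVM the code runs in.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: summarizes GC activity of the current JVM.
final class GcSnapshot {

    // Percentage of uptime NOT spent in GC, e.g. 1000 ms uptime with 30 ms GC -> 97.0.
    static double throughputPercent(long uptimeMillis, long gcMillis) {
        if (uptimeMillis <= 0) {
            return 100.0; // nothing measured yet
        }
        return 100.0 * (uptimeMillis - gcMillis) / uptimeMillis;
    }

    // Sums the accumulated collection time over all collectors of this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 if undefined for this collector
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                + ": count=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("throughput=%.2f%%%n", throughputPercent(uptime, totalGcMillis()));
    }
}
```

With G1 on HotSpot the collectors are typically reported as "G1 Young Generation" and "G1 Old Generation"; a steadily climbing count on the latter would match the back-to-back full GCs described in the question. The collection times are approximations reported by the JVM, so they will not match GCViewer exactly.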