Why is the Java G1 gc spending so much time scanning RS?

Tags:

I'm currently evaluating the G1 garbage collector and how it performs for our application. Looking at the gc-log, I noticed a lot of collections have very long "Scan RS" phases:

7968.869: [GC pause (mixed), 10.27831700 secs]
   [Parallel Time: 10080.8 ms]
   (...)
      [Scan RS (ms):  4030.4  4034.1  4032.0  4032.0
       Avg: 4032.1, Min: 4030.4, Max: 4034.1, Diff:   3.7]
      [Object Copy (ms):  6038.5  6033.3  6036.7  6037.1
       Avg: 6036.4, Min: 6033.3, Max: 6038.5, Diff:   5.2]
   (...)
    [Eden: 19680M(19680M)->0B(20512M) Survivors: 2688M->2624M Heap:
     75331M(111904M)->51633M(115744M)]
 [Times: user=40.49 sys=0.02, real=10.28 secs]

All the removed log-rows entries show runtimes in single-digit ms.

I think most of the time should be spent in copying, right? What could be the reason Scan RS takes so long? Any ideas on how to tweak the G1-settings? The JVM was started with

-Xms40960M -Xmx128G -XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log

Edit: Oh, I forgot... I'm using Java 7u25

Update: I noticed two other weird things:

16187.740: [GC concurrent-mark-start]
16203.934: [GC pause (young), 2.89871800 secs]
(...)
16218.455: [GC pause (young), 4.61375100 secs]
(...)
16237.441: [GC pause (young), 4.46131800 secs]
(...)
16257.785: [GC pause (young), 4.73922600 secs]
(...)
16275.417: [GC pause (young), 3.87863400 secs]
(...)
16291.505: [GC pause (young), 3.72626400 secs]
(...)
16307.824: [GC pause (young), 3.72921700 secs]
(...)
16325.851: [GC pause (young), 3.91060700 secs]
(...)
16354.600: [GC pause (young), 5.61306000 secs]
(...)
16393.069: [GC pause (young), 17.50453200 secs]
(...)
16414.590: [GC concurrent-mark-end, 226.8497670 sec]

The concurrent GC run is continuing while parallel runs are being performed. I'm not sure if that's intended, but it kinda seems wrong to me. Admittedly, this is an extreme example, but I do see this behaviour all over my log.

Another thing is that my JVM process grew to 160g. Considering a heap-size of 128g, that's a rather large overhead. Is this to be expected, or is G1 leaking memory? Any ideas on how to find that out?

PS: I'm not really sure if I should've made new questions for the updates... if any of you think that this would be beneficial, tell me ;)

Update 2: I guess the G1 really may be leaking memory: http://printfdebugger.tumblr.com/post/19142660766/how-i-learned-to-love-cms-and-had-my-heart-broken-by-g1 As this is a deal-breaker for now, I'm not going to spend more time on playing with this. Things I didn't yet try is configuring region size (-XX:G1HeapRegionSize) and lowering the heap occupancy (-XX:InitiatingHeapOccupancyPercent).

907

asked Jul 12 '13 12:07

JohannesR

1 Answers

Let's see.

1 - First clues

It looks like your GC was configured to use 4 threads (or you have 4 vCPU, but it is unlikely given the size of the heap). It is quite low for a 128GB heap, I was expecting more.

The GC events seems to happen at 25+ seconds interval. However, the log extract you gave do not mention the number of regions that were processed.

=> By any chance, did you specify pause time goals to G1GC (-XX:MaxGCPauseMillis=N) ?

2 - Long Scan RSet time

"Scan RSet" means the time the GC spent in scanning the Remembered Sets. Remembered Set of a region contains cards that correspond to the references pointing into that region. This phase scans those cards looking for the references pointing into all the regions of the collection set.

So here, we have one more question :

=> How many regions were processed during that particular collection (i.e. how big is the CSet)

3 - Long Object Copy time

The copy time, as the name suggest, is the time spend by each worker thread copying live objects from the regions in the Collection Set to the other regions.

Such long copy time can suggest that a lot of regions were processed, and that you may want to reduce that number. It could also suggest swapping, but this is very unlikely given your user/real values at the end of the log.

4 - Now what to do

You should check in the GC log the number of regions that were processed. Correlate this number with your region size and deduce the amount of memory that was scanned.

You can then set a smaller pause time goal (for instance, to 500ms using -XX:MaxGCPauseMillis=500). This will

increase the number of GC events,
reduce the amount of freed memory per GC cycle
reduce the STW pauses during YGC

Hope that helps !

Sources :

https://blogs.oracle.com/poonam/entry/understanding_g1_gc_logs
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html
http://jvm-options.tech.xebia.fr/

108

answered Sep 21 '22 06:09

Pierre Laporte

Related questions
                            
                                Pass zero-sized array, save allocation?
                            
                                is this a java memory leak
                            
                                Java: how to store references to all instances of a class?
                            
                                Vertical Scaling of Java real-time application
                            
                                java garbage collecting in dialog
                            
                                Managing Memory with Python Reading Objects of Varying Sizes from OODB's
                            
                                Why does v8 run out of memory in this situation?
                            
                                JavaScript WeakMap keep referencing gc'ed objects
                            
                                What is the default Full GC interval in Java 8
                            
                                GC output clarification
                            
                                Will the Java Garbage Collector stop my POSIX threads created from JNI calls?
                            
                                Custom WeakReference implementation
                            
                                internal reference prevents garbage collection
                            
                                What are some behavioral anomalies with data structures that use reference counting in STL?
                            
                                Why GC doesn't collect my objects?
                            
                                Object deletes reference to self
                            
                                VB.net Garbage collector not releasing objects
                            
                                Optimizing Tomcat / Garbage Collection
                            
                                Why does PHP's garbage collector slow down performance, and how to manage memory without it?
                            
                                How to find gc roots in a stack machine?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why is the Java G1 gc spending so much time scanning RS?

Tags:

garbage-collection

jvm-hotspot

g1gc

JohannesR

People also ask

1 Answers

Pierre Laporte

Recent Activity

Donate For Us