Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Full GC becoming very frequent

I've got a Java webapp running on one tomcat instance. During peak times the webapp serves around 30 pages per second and normally around 15.

My environment is:

O/S: SUSE Linux Enterprise Server 10 (x86_64)
RAM: 16GB

server: Tomcat 6.0.20
JVM: Java HotSpot(TM) 64-Bit Server VM 1.6.0_14
JVM options:
CATALINA_OPTS="-Xms512m -Xmx1024m -XX:PermSize=128m -XX:MaxPermSize=256m
               -XX:+UseParallelGC
               -Djava.awt.headless=true
               -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
JAVA_OPTS="-server"

After a couple of days of uptime the Full GC starts occurring more frequently and it becomes a serious problem to the application's availability. After a tomcat restart the problem goes away but, of course, returns after 5 to 10 or 30 days (not consistent).

The Full GC log before and after a restart is at http://pastebin.com/raw.php?i=4NtkNXmi

It shows a log before the restart at 6.6 days uptime where the app was suffering because Full GC needed 2.5 seconds and was happening every ~6 secs.

Then it shows a log just after the restart where Full GC only happened every 5-10 minutes.

I've got two dumps using jmap -dump:format=b,file=dump.hprof PID when the Full GCs where occurring (I'm not sure whether I got them exactly right when a Full GC was occurring or between 2 Full GCs) and opened them in http://www.eclipse.org/mat/ but didn't get anything useful in Leak Suspects:

  • 60MB: 1 instance of "org.hibernate.impl.SessionFactoryImpl" (I use hibernate with ehcache)
  • 80MB: 1,024 instances of "org.apache.tomcat.util.threads.ThreadWithAttributes" (these are probably the 1024 workers of tomcat)
  • 45MB: 37 instances of "net.sf.ehcache.store.compound.impl.MemoryOnlyStore" (these should be my ~37 cache regions in ehcache)

Note that I never get an OutOfMemoryError.

Any ideas on where should I look next?

like image 539
cherouvim Avatar asked Oct 27 '11 13:10

cherouvim


People also ask

What triggers a full GC?

The reason that a Full GC occurs is because the application allocates too many objects that can't be reclaimed quickly enough. Often concurrent marking has not been able to complete in time to start a space-reclamation phase.

How can we avoid major GC?

If your application's object creation rate is very high, then to keep up with it, the garbage collection rate will also be very high. A high garbage collection rate will increase the GC pause time as well. Thus, optimizing the application to create fewer objects is THE EFFECTIVE strategy to reduce long GC pauses.

What causes high garbage collection?

The root cause of high-frequency garbage collection is object churn—many objects being created and disposed of in short order. Nearly all garbage collection strategies are well suited for such a scenario; they do their job well and are fast. However, this still consumes resources.

How often does GC run?

That is, every time you allocate a small block of memory (big blocks are typically placed directly into "older" generations), the system checks whether there's enough free space in the gen-0 heap, and if there isn't, it runs the GC to free up space for the allocation to succeed.


1 Answers

When we had this issue we eventually tracked it down to the young generation being too small. Although we had given plenty of ram the young generation wasn't given it's fair share.

This meant that small garbage collections would happen more frequently and caused some young objects to be moved into the tenured generation meaning more large garbage collections also.

Try using the -XX:NewRatio with a fairly low value (say 2 or 3) and see if this helps.

More info can be found here.

like image 148
Jim Avatar answered Oct 07 '22 00:10

Jim