Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

frequent major gc but not free heap?

After running several hours, My http server begin frequenly major gc, but no heap was freed.

several times major gc later, promotion failed and concurrent mode failure occured, then heap was freed. My gc log is below :

{Heap before GC invocations=7172 (full 720):
 par new generation   total 737280K, used 667492K [0x000000076b800000, 0x000000079d800000, 0x000000079d800000)
  eden space 655360K, 100% used [0x000000076b800000, 0x0000000793800000, 0x0000000793800000)
  from space 81920K,  14% used [0x0000000793800000, 0x00000007943d91d0, 0x0000000798800000)
  to   space 81920K,   0% used [0x0000000798800000, 0x0000000798800000, 0x000000079d800000)
 concurrent mark-sweep generation total 1482752K, used 1479471K [0x000000079d800000, 0x00000007f8000000, 0x00000007f8000000)
 concurrent-mark-sweep perm gen total 131072K, used 58091K [0x00000007f8000000, 0x0000000800000000, 0x0000000800000000)
2015-11-19T21:50:02.692+0800: 113963.532: [GC2015-11-19T21:50:02.692+0800: 113963.532: [ParNew (promotion failed)
Desired survivor size 41943040 bytes, new threshold 15 (max 15)
- age   1:    3826144 bytes,    3826144 total
- age   2:     305696 bytes,    4131840 total
- age   3:     181416 bytes,    4313256 total
- age   4:     940632 bytes,    5253888 total
- age   5:      88368 bytes,    5342256 total
- age   6:     159840 bytes,    5502096 total
- age   7:     733856 bytes,    6235952 total
- age   8:      64712 bytes,    6300664 total
- age   9:     314304 bytes,    6614968 total
- age  10:     587160 bytes,    7202128 total
- age  11:      38728 bytes,    7240856 total
- age  12:     221160 bytes,    7462016 total
- age  13:     648376 bytes,    8110392 total
- age  14:      33296 bytes,    8143688 total
- age  15:     380768 bytes,    8524456 total
: 667492K->665908K(737280K), 0.7665810 secs]2015-11-19T21:50:03.459+0800: 113964.299: [CMS2015-11-19T21:50:05.161+0800: 113966.001: [CMS-concurrent-mark: 3.579/4.747 secs] [Times: user=13.41 sys=0.35, rea
l=4.75 secs] 
 (concurrent mode failure): 1479910K->44010K(1482752K), 4.7267420 secs] 2146964K->44010K(2220032K), [CMS Perm : 58091K->57795K(131072K)], 5.4939440 secs] [Times: user=9.07 sys=0.13, real=5.49 secs] 
Heap after GC invocations=7173 (full 721):
 par new generation   total 737280K, used 0K [0x000000076b800000, 0x000000079d800000, 0x000000079d800000)
  eden space 655360K,   0% used [0x000000076b800000, 0x000000076b800000, 0x0000000793800000)
  from space 81920K,   0% used [0x0000000798800000, 0x0000000798800000, 0x000000079d800000)
  to   space 81920K,   0% used [0x0000000793800000, 0x0000000793800000, 0x0000000798800000)
 concurrent mark-sweep generation total 1482752K, used 44010K [0x000000079d800000, 0x00000007f8000000, 0x00000007f8000000)
 concurrent-mark-sweep perm gen total 131072K, used 57795K [0x00000007f8000000, 0x0000000800000000, 0x0000000800000000)
}  

It seems the CMS GC doesn't make any sense. Could you please explain to me ?

This is my gc config:

/usr/local/jdk1.7.0_79/bin/java 
-server 
-Xms2248m 
-Xmx2248m 
-Xmn800m 
-XX:PermSize=128m 
-XX:MaxPermSize=128m 
-XX:MaxTenuringThreshold=15 
-XX:+UseCMSCompactAtFullCollection 
-XX:CMSFullGCsBeforeCompaction=0 
-XX:+UseConcMarkSweepGC 
-XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps 
-XX:+PrintGCDateStamps 
-Xloggc:gc.log 
-XX:+PrintHeapAtGC 
-XX:+PrintTenuringDistribution 
-XX:+UseFastAccessorMethods

UPDATE

There is a periodic task since the server start. Its job is load data from mysql and keep in jvm heap. When client request comes, the server should compute with the data. The code of task like this :

private volatile List<ActivityInfo> activityInfos;

public void run () {
    activityInfos = db.loadActivity();
}

public ActivityInfo getActivityByClient() {
    //
    List<ActivityInfo> local = activityInfos; 
    // biz code
    ActivityInfo response = // biz code
    return response;
}

// executor 
executor.scheduleWithFixedDelay(task, 0, 1, TimeUnit.MINUTES);

What confused me most is why heap was freed after full gc, not after major gc?

UPDATE

full gc log is here

like image 317
znlyj Avatar asked Nov 19 '15 14:11

znlyj


3 Answers

That would indicate that you're running very close to your maximum heap size, so there are frequent GCs but very little being freed. Try increasing it significantly, say by 1.5x or 2x.

like image 171
user207421 Avatar answered Oct 26 '22 07:10

user207421


For resolving it, you can use Eclipse Memory Analyzer. It will show you all these memory and thread related issues in detail. You can also jConsole for it.

like image 20
mayank agrawal Avatar answered Oct 26 '22 09:10

mayank agrawal


Your nursery heap is set to 800MB with -Xmn800m but the nursery heap usage after collection is just 8MB

- age 15: 380768 bytes, 8524456 total

So your application can run a long time just garbage collecting the nursery heap. However at some point the tenured heap will fill up. By collection 7172 there is only about 3MB left - the difference between the total and the used heap values.

concurrent mark-sweep generation total 1482752K, used 1479471K

The Garbage Collector has noted that the tenured heap is close to capacity and a concurrent mark will have been triggered before the start of the log activity posted. During the concurrent mark phase, objects continue to be allocated and the nursery heap fills up triggering a nursery collection.

When the nursery is collected, there is no longer enough space in the tenured heap to hold the objects which are to be promoted to the tenured heap. This results in the promotion failed and the garbage collector is forced to process the whole heap, not just the nursery. As this occurs before the concurrent mark phase has completed, a concurrent mode failure is also logged. After the full collection there is 1.4GB in the tenured heap.

concurrent mark-sweep generation total 1482752K, used 44010K

This is the way that things are meant to happen. Nursery collections are cheap if most new objects quickly go out of scope and the JVM will happily only collect the nursery for as long as it can. Eventually though the tenured heap fills up and the more expensive full GC is required.

What would happen if you reduced the nursery, say to half its size? Assuming that your application is creating objects at a constant rate, the nursery would fill up in about half the time. Since the amount of data in use by the application is independent of GC, the same number of objects would be retained and promoted as with the larger nursery. So the tenured collection would fill up faster as well.

With the same total heap size however. the tenured area is bigger than before, so it will take more nursery collections to fill up the tenured area, so there are trade offs. A good rule of thumb is to size the nursery at a quarter the size of the tenured area.

UPDATE

The full gc.log is from a different GC run but I guess that the application behaviour is similar. In it I see many CMS: abort preclean due to time messages. These are described in Jon Masamitsu's Weblog. For reasons of efficiency, the CMS collector relies on a nursery collection to happen before stopping all the executable threads. If such a collection does not happen within a certain amount of time, the CMS collection is aborted.

When application load is low, but tenured heap usage is high, the CMS collector will start to run and go through its initial mark phase. When a nursery collection fails to run, the CMS collection is aborted. This may happen several times. Then a nursery heap occurs and the cycle repeats. This will carry on until either a CMS and nursery collection coincide or the tenured heap fills up completely.

As objects are only being promoted to the tenured heap slowly, this behaviour can persist for some time. Here it lasts from 2015-11-24T00:28:23.921 to 2015-11-24T01:55:52.461 - an hour and a half. During this period, time is wasted performing an initial mark only to abort the operation.

There are a number of possibilities to fix this behaviour.

  • Reduce the size of the nursery. This will increase the rate of nursery collections and so they will coincide with CMS collections more often. However the large nursery seems to work well and reducing it will cause more CMS collections with worse performance under heavy load.
  • Increase CMSMaxAbortablePrecleanTime. This will mean that CMS will wait for longer before aborting the collection. However the longer the wait, the more expensive the CMS collection will be
  • Turn on CMSScavengeBeforeRemark. This is my recommendation. This will force a nursery collection at just the right time and the CMS collection will never be aborted. The pause time for the remark phase will be longer as a nursery collection will happen as well, but as the extra time is small and full collections so rare, this is unlikely to be an issue.

Note that the nursery heap is sometimes called the young generation and the tenured heap the old generation. For more information take a look at Understanding Garbage Collection or Garbage Collection Basics

like image 26
Neil Masson Avatar answered Oct 26 '22 08:10

Neil Masson