
How to convince Java Garbage Collector to run when working set is small?

This is yet another "please tell me how to force the Java garbage collector to run" question. In our application, I believe we have good reasons for doing this.

This is a server application, which typically has around 5M live objects. Once every 5 minutes, we perform an analysis task which takes ~60 seconds. If a full GC is triggered while the analysis is running, it sees around 40M live objects. The extra 35M objects become garbage when the analysis completes. The server must remain responsive to requests at all times (even while the analysis is running).

We've found that a full GC takes around 1.5 seconds if invoked when the analysis is not running, but around 15 seconds while the analysis is running. Unfortunately, our allocation pattern is such that full GCs usually trigger during the analysis, even though the analysis is only running 20% of the time. (Every third or fourth analysis run triggers a full GC.)

I added code to call the much-scorned System.gc() just before beginning an analysis run, if free space in the old generation is below a certain threshold (5GB). The benefit was very substantial: we're getting 1.5 second pause times instead of 15 second pause times, and we free more garbage into the bargain. However, sometimes the System.gc() call is ignored, and we wind up with a 15-second pause a few minutes later when the GC is triggered automatically.
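A minimal sketch of the kind of check described above, not the actual production code. It assumes a HotSpot-style JVM whose old-generation memory pool has a name containing "Old Gen" or "Tenured" (for example "PS Old Gen" under the Parallel collector); the class name, method names, and the way the 5GB threshold is expressed are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class PreAnalysisGc {
    // Hypothetical constant mirroring the 5GB threshold mentioned in the question.
    static final long FREE_THRESHOLD_BYTES = 5L * 1024 * 1024 * 1024;

    /** Free bytes in the old generation, or -1 if no suitable pool or limit is found. */
    static long oldGenFreeBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: pool naming is JVM-specific; these substrings match common HotSpot names.
            if (pool.getType() == MemoryType.HEAP
                    && (name.contains("Old Gen") || name.contains("Tenured"))) {
                MemoryUsage usage = pool.getUsage();
                if (usage == null || usage.getMax() < 0) {
                    return -1; // pool invalid or maximum undefined
                }
                return usage.getMax() - usage.getUsed();
            }
        }
        return -1;
    }

    /** True only when the free space is known and below the threshold. */
    static boolean shouldCollect(long freeBytes, long thresholdBytes) {
        return freeBytes >= 0 && freeBytes < thresholdBytes;
    }

    /** Requests a GC before an analysis run if old-gen free space is low. */
    static boolean collectIfLow(long thresholdBytes) {
        if (shouldCollect(oldGenFreeBytes(), thresholdBytes)) {
            System.gc(); // only a request; the JVM may ignore it
            return true;
        }
        return false;
    }
}
```

Calling PreAnalysisGc.collectIfLow(PreAnalysisGc.FREE_THRESHOLD_BYTES) just before starting an analysis run would mirror the approach described here. The return value only says that a collection was requested, not that one happened.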

My question, then: is there something we can do to more strongly convince the garbage collector to run? We're running 1.7.0_09-icedtea and using the Parallel GC. I'd like either (a) a reliable way to manually force garbage collection, or (b) some way to tune the collector so that it makes a more intelligent automatic decision. (b) seems hard, as it's not clear to me how the collector could detect that our working set varies in this dramatic fashion.

I'm willing to resort to substantial hackery if need be; this is a serious issue for us. (We might look into the CMS or G1 collectors as alternatives, but I'm leery of the throughput impact of CMS, and G1 is reputed to behave poorly in the face of large byte arrays, which we use.)

Addendum: In production, our experience so far has been that System.gc() usually does trigger a full garbage collection, at least in the situations where we're calling it. (We only call it once every 10 to 30 minutes, with the heap somewhat but not completely filled with garbage.) It would be nice to be able to trigger garbage collection more reliably, but it is helping us most of the time.
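One way to observe whether a System.gc() request was honored is to compare collection counts from the standard GarbageCollectorMXBean API before and after the call. This is a rough sketch with hypothetical class and method names. A rising count is only a hint, since a collection triggered by another thread's allocations would also raise it, and the sum covers young and old collectors alike.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcProbe {
    /** Sum of collection counts over all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Requests a GC and reports whether any collector's count rose during the call. */
    static boolean requestGcAndCheck() {
        long before = totalCollections();
        System.gc(); // a request only; for example, -XX:+DisableExplicitGC turns it into a no-op
        return totalCollections() > before;
    }
}
```

Logging the result of GcProbe.requestGcAndCheck() over time would give a picture of how often the request is ignored. To count only full collections, the loop could be restricted to the old-generation bean, whose name is JVM-specific.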

Steve, asked Oct 17 '13 00:10


2 Answers

Your problem is that you're running two applications with entirely different requirements and memory profiles in the same JVM.

Run the data analysis separately, in a non-user-facing process, so that the user-facing server remains constantly responsive. I assume the periodic analysis generates a summary or result data of some kind; make that available to end users by shipping it across to the user-facing server so it can be served from there, or else let your front end fetch it separately from the analysis server.

DPM, answered Sep 20 '22 06:09


Consider using non-managed (off-heap) memory, i.e., direct ByteBuffers in place of the byte arrays.
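As a small illustration of that suggestion, the sketch below copies a byte array into a direct buffer obtained from ByteBuffer.allocateDirect, so the payload lives outside the garbage-collected heap. The class and method names are hypothetical, and whether this helps depends on how the arrays are actually used.

```java
import java.nio.ByteBuffer;

final class OffHeapBlob {
    /** Copies data into a newly allocated direct (off-heap) buffer, ready for reading. */
    static ByteBuffer toDirect(byte[] data) {
        ByteBuffer buf = ByteBuffer.allocateDirect(data.length);
        buf.put(data);
        buf.flip(); // switch from writing to reading: position 0, limit = data.length
        return buf;
    }

    /** Copies the remaining bytes back onto the heap without moving the caller's position. */
    static byte[] toArray(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out); // duplicate shares content but has its own position
        return out;
    }
}
```

Note that the native memory behind a direct buffer is released only once the small ByteBuffer object itself is collected, and the total is capped by -XX:MaxDirectMemorySize, so this shifts the memory pressure rather than removing it.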

I can only offer a hack which will need some tuning and then might or might not work. I'd first try the saner solutions. When you want to force the GC, do it by allocating a lot of memory. Do this so that the memory can be immediately reclaimed, but so that the whole allocation can't be optimized away (something like sum += new byte[123456].hashCode() should do). You'll need to find a reliable method for determining when to stop. An object with a finalizer might tell you, or maybe watching Runtime.getRuntime().freeMemory() could help.
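A rough sketch of this hack, under some assumptions: it uses a weakly referenced sentinel object as the stop signal instead of a finalizer, and it caps the total allocation with a byte budget so the loop always ends. All names are hypothetical. The sentinel can be cleared by a young collection, so this detects that some collection ran, not that a full GC ran.

```java
import java.lang.ref.WeakReference;

final class AllocationPressure {
    // Written once per call so the JIT is less likely to discard the allocations.
    static volatile long sink;

    /**
     * Allocates short-lived arrays until the sentinel is collected or the budget is spent.
     * Returns the number of bytes requested in total.
     */
    static long allocateUntilCollected(int chunkBytes, long budgetBytes) {
        WeakReference<Object> sentinel = new WeakReference<Object>(new Object());
        long allocated = 0;
        long sum = 0;
        while (sentinel.get() != null && allocated + chunkBytes <= budgetBytes) {
            sum += new byte[chunkBytes].hashCode(); // garbage immediately after this line
            allocated += chunkBytes;
        }
        sink = sum;
        return allocated;
    }
}
```

A call such as AllocationPressure.allocateUntilCollected(123456, budget) returns early if a collection clears the sentinel. Choosing the budget and the chunk size is the tuning this answer warns about.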

maaartinus, answered Sep 19 '22 06:09