
Dealing with 150GB heap in non interactive application

Hello, I have a program with a 150GB heap that uses an in-memory data grid. The operations department has a crazy requirement that it run on a single machine. We all know what happens if the parallel garbage collector is used on a 150GB heap: if a full GC is invoked, it will probably take tens of minutes.

I was hoping that Java 9 would ship with Shenandoah, the low-pause GC. Unfortunately, from what I can see, it is not listed for delivery in Java 9. Does anyone know anything about that?

Nevertheless, I am wondering how G1 GC will perform with this amount of heap memory.

And one last question. I have a non-interactive batch application that is supposed to complete in, let's say, 2 hours. The main goal here is to ensure that a full GC never kicks in. If I make sure there is plenty of memory, say the heap usage peaks at 150GB and I allocate 250GB, can I say with good confidence that a full GC will never kick in? Usually a full GC is triggered when the young generation plus the old generation reaches the maximum heap size. Can it be triggered in a different way?

This question has been flagged as a duplicate, so I will try to explain here why it is not one. First, we are talking about a 150GB heap, which adds a completely different dimension to the question. Second, I don't use RMI, as the other question does. Third, I am also asking, between the lines, about the G1 garbage collector. Also, once we go beyond the 32GB heap barrier we enter the 64-bit address space; you cannot convince me that a question about a heap <32GB is the same as a question about a heap >32GB. Not to mention that things have changed a bit since Java 7; for instance, PermGen no longer exists.

Alexander Petrov asked Jul 09 '16 20:07



1 Answer

The rule of thumb for a compacting GC is that it should be able to process 1 GB of live objects per core per second.

Example on a Haswell i7 (4 cores/8 threads) with a 20GB heap and the parallel collector:

[24.757s][info][gc,heap        ] GC(109) PSYoungGen: 129280K->0K(917504K)
[24.757s][info][gc,heap        ] GC(109) ParOldGen: 19471666K->7812244K(19922944K)
[24.757s][info][gc             ] GC(109) Pause Full (Ergonomics) 19141M->7629M(20352M) (23.791s, 24.757s) 966.174ms
[24.757s][info][gc,cpu         ] GC(109) User=6.41s Sys=0.02s Real=0.97s

The live set after compaction is 7.6GB. The collection takes 6.4 seconds of CPU time; thanks to parallelism, this translates to a pause of less than 1 second.
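To see how the log above lines up with the rule of thumb, here is a minimal sketch of the arithmetic. The class and method names are made up for illustration, the numbers are copied from the log lines, and 1 GB is taken as 1024 MB, so the result is only approximate.

```java
import java.util.Locale;

// Hypothetical helper: arithmetic on the numbers from the GC log above.
class GcLogArithmetic {
    /** Average number of hardware threads kept busy: CPU time divided by wall-clock time. */
    static double effectiveParallelism(double userSeconds, double realSeconds) {
        return userSeconds / realSeconds;
    }

    /** Live data left after compaction per CPU-second, in GB (assuming 1 GB = 1024 MB). */
    static double gbPerCpuSecond(double liveMb, double userSeconds) {
        return (liveMb / 1024.0) / userSeconds;
    }

    public static void main(String[] args) {
        double user = 6.41;    // "User=6.41s" from the gc,cpu line
        double real = 0.97;    // "Real=0.97s" from the gc,cpu line
        double liveMb = 7629;  // heap after the full GC: "19141M->7629M"

        System.out.printf(Locale.ROOT, "effective parallelism: %.1f%n",
                effectiveParallelism(user, real));
        System.out.printf(Locale.ROOT, "GB per CPU-second:     %.2f%n",
                gbPerCpuSecond(liveMb, user));
    }
}
```

With these inputs the effective parallelism comes out at roughly 6.6 and the throughput at a bit over 1 GB per CPU-second, which is in line with the 1 GB per core per second heuristic.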

In principle, the parallel collector should be able to handle a 150GB heap with full GC times under roughly 2 minutes on a multi-core system, even when most of the heap consists of live objects.
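As a back-of-the-envelope check, the rule of thumb can be turned into a small estimate. This is only a sketch: the class name, the method name and the core counts are assumptions for illustration, and the 1 GB per core per second rate is the heuristic from above, not a measurement of any particular machine.

```java
import java.util.Locale;

// Hypothetical helper: estimates a full GC pause from the rule of thumb.
class FullGcPauseEstimate {
    /**
     * pause = live data / (cores * per-core rate).
     * Ignores paging, NUMA effects, reference processing and other slowdowns.
     */
    static double estimatePauseSeconds(double liveGb, int cores, double gbPerCorePerSecond) {
        if (liveGb < 0 || cores <= 0 || gbPerCorePerSecond <= 0) {
            throw new IllegalArgumentException("live set must be >= 0; cores and rate must be > 0");
        }
        return liveGb / (cores * gbPerCorePerSecond);
    }

    public static void main(String[] args) {
        double liveGb = 150.0; // worst case: the whole heap is live
        double rate = 1.0;     // GB per core per second (rule of thumb, assumed)
        for (int cores : new int[] {2, 4, 8, 16, 32}) {
            System.out.printf(Locale.ROOT, "%2d cores -> ~%.1f s full GC pause%n",
                    cores, estimatePauseSeconds(liveGb, cores, rate));
        }
    }
}
```

Even with only 2 cores the estimate is 75 seconds for a fully live 150GB heap, which is where the "under 2 minutes" figure comes from. Real numbers can be worse for the reasons listed below.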

Of course this is just a rule of thumb. Some things that can affect it negatively:

  • paging
  • thermal CPU throttling
  • workloads consisting of very large, reference-heavy objects
  • non-local memory traffic in NUMA configurations
  • other processes competing for CPU time
  • heavy use of weak/soft references

In some cases tuning may be necessary to achieve this throughput.

If the parallel collector does not work despite all that, then CMS and G1 can be viable alternatives, but only if there is enough spare heap capacity and enough CPU cores available to the JVM. They need significant breathing room to do their concurrent work without risking a full GC.

It is correct that I said non-interactive, but I still have strict license agreements. I need to finish the whole processing in an hour, so I cannot afford a 30-minute stop-the-world event.

Basically, you don't really need low pause times in the sense that CMS, G1, Shenandoah or Zing aim for (they aim for <100ms or even <10ms even on large heaps).

All you need is that STW pauses are not so catastrophically bad that they eat a significant portion of your compute time.

This should be feasible with most of the available collectors, ignoring the serial one.

In practice there are some pathological edge cases where they may fall down, but to find out whether you hit one, you need to set up a system with your actual workload and do some test runs. If you experience real problems, you can then ask a question with more details.
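For such a test run, one rough way to see how much of the wall-clock time goes to GC is to read the standard `GarbageCollectorMXBean` counters before and after the batch. The sketch below is an illustration only: `GcOverheadReport` and `runPlaceholderBatch` are made-up names, and the placeholder loop stands in for your real workload. The reported collection time is approximate, and with concurrent collectors it may include work that is not a stop-the-world pause, so treat the GC log as the authoritative source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Locale;

// Hypothetical helper: reports the share of wall-clock time spent in GC.
class GcOverheadReport {
    /** Sum of the accumulated collection time over all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Fraction of the run spent in GC; 0 if the wall time is not positive. */
    static double gcFraction(long gcMillis, long wallMillis) {
        return wallMillis <= 0 ? 0.0 : (double) gcMillis / wallMillis;
    }

    /** Placeholder workload (assumption): churns short-lived arrays to cause some GC activity. */
    static long runPlaceholderBatch() {
        long checksum = 0;
        for (int i = 0; i < 20_000; i++) {
            byte[] garbage = new byte[64 * 1024];
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        long checksum = runPlaceholderBatch(); // replace with the real batch job
        long wallMillis = (System.nanoTime() - start) / 1_000_000;
        long gcMillis = totalGcMillis() - gcBefore;

        System.out.printf(Locale.ROOT, "wall=%d ms, gc=%d ms, gc share=%.1f%% (checksum %d)%n",
                wallMillis, gcMillis, 100.0 * gcFraction(gcMillis, wallMillis), checksum);
    }
}
```

If the GC share stays at a few percent of a one-hour run, the pauses are not eating a significant portion of the compute time, which is all that matters here.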

the8472 answered Sep 30 '22 15:09