 

Java GC overhead limit exceeded - Custom solution needed

I am evaluating data from a text file in a rather large algorithm.

If the text file contains more than a certain number of datapoints (the minimum I need is something like 1.3 million datapoints), it throws the following error:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
   at java.util.regex.Matcher.<init>(Unknown Source)
   at java.util.regex.Pattern.matcher(Unknown Source)
   at java.lang.String.replaceAll(Unknown Source)
   at java.util.Scanner.processFloatToken(Unknown Source)
   at java.util.Scanner.nextDouble(Unknown Source)

I'm running it in Eclipse with the following settings for the installed JRE 6 (standard VM):

-Xms20m -Xmx1024m -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40 -XX:NewSize=10m 
-XX:MaxNewSize=10m -XX:SurvivorRatio=6 -XX:TargetSurvivorRatio=80 
-XX:+CMSClassUnloadingEnabled

Note that it works fine if I only run through part of the text file.

Now I've read a lot about this subject, and it seems that I must either have a memory leak somewhere or be storing too much data in arrays (which I think I do).

Now my problem is: how can I work around this? Is it possible to change my settings such that I can still perform the computation or do I really need more computational power?

Jean-Paul asked May 31 '13


3 Answers

The really critical VM argument is -Xmx1024m, which tells the VM to use up to 1024 megabytes of memory. The simplest solution is to use a bigger number there. You can try -Xmx2048m or -Xmx4096m, or any number, assuming you have enough RAM in your machine to handle it.

I'm not sure you're getting much benefit out of any of the other VM args. For the most part, if you tell Java how much space to use, it will be smart with the rest of the params. I'd suggest removing everything except the -Xmx param and seeing how that performs.

A better solution is to try to improve your algorithm, but I haven't yet read through it in enough detail to offer any suggestions.
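One concrete algorithmic improvement is suggested by the stack trace itself: Scanner.nextDouble builds a regex Matcher for every token, which produces a lot of short-lived garbage at 1.3 million datapoints. A sketch of a leaner reading loop using BufferedReader and Double.parseDouble instead (the whitespace-separated file layout is an assumption, and try-with-resources needs Java 7; on JRE 6 you would close the reader in a finally block):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class DataLoader {

    // Parse one line of whitespace-separated doubles without
    // Scanner's per-token regex/Matcher overhead.
    static double[] parseLine(String line) {
        String[] fields = line.trim().split("\\s+");
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i]);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        List<double[]> rows = new ArrayList<double[]>();
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.isEmpty()) {
                    rows.add(parseLine(line)); // primitive arrays, not boxed Doubles
                }
            }
        } finally {
            in.close();
        }
        System.out.println("rows read: " + rows.size());
    }
}
```

Storing each row as a primitive double[] rather than a List<Double> also avoids one boxed object per datapoint, which matters far more at this scale than any GC tuning flag.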

Eric Grunzke answered Oct 05 '22


As you say the data size is really very large: if it does not fit in one computer's memory even after raising the -Xmx JVM argument, you may want to move to cluster computing, with many computers working on your problem. For this you would use the Message Passing Interface (MPI).

MPJ Express is a very good implementation of MPI for Java; in languages like C/C++ there are mature implementations such as Open MPI and MPICH2. I am not sure whether it will help you in this situation, but it will certainly help in future projects.

Sourabh Bhat answered Oct 06 '22


I suggest you

  • use a profiler to minimize your memory usage. I suspect you can reduce it by a factor of 10x or more by using primitives, binary data, and more compact collections.
  • increase the memory in your machine. The last time I did back-testing of hundreds of signals I had 256 GB of main memory, and even that was barely enough at times. The more memory you can get, the better.
  • use memory mapped files to increase memory efficiency.
  • reduce the size of your data set to something your machine and program can support.
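The memory-mapped-files point can be sketched as follows, assuming the datapoints are stored as raw little-endian doubles in a binary file (the file format is an assumption, not something from the question). The OS pages the data in on demand, so the Java heap never has to hold the whole data set:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedDoubles {

    // Sum doubles straight out of a memory-mapped file: no array of
    // the full data set is ever allocated on the Java heap.
    static double sum(String path) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(path, "r");
        try {
            FileChannel ch = raf.getChannel();
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            DoubleBuffer doubles = map.order(ByteOrder.LITTLE_ENDIAN).asDoubleBuffer();
            double total = 0.0;
            while (doubles.hasRemaining()) {
                total += doubles.get();
            }
            return total;
        } finally {
            raf.close();
        }
    }
}
```

A single MappedByteBuffer is limited to 2 GB (its size argument is bounded by Integer.MAX_VALUE through the buffer API), so a data set larger than that would need to be mapped in windows.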
Peter Lawrey answered Oct 05 '22