I am evaluating data from a text file in a rather large algorithm. If the text file contains more than a certain number of datapoints (the minimum I need is something like 1.3 million datapoints), it gives the following error:
```
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.regex.Matcher.<init>(Unknown Source)
    at java.util.regex.Pattern.matcher(Unknown Source)
    at java.lang.String.replaceAll(Unknown Source)
    at java.util.Scanner.processFloatToken(Unknown Source)
    at java.util.Scanner.nextDouble(Unknown Source)
```
I am running it in Eclipse with the following settings for the installed JRE 6 (standard VM):
```
-Xms20m -Xmx1024m -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40 -XX:NewSize=10m
-XX:MaxNewSize=10m -XX:SurvivorRatio=6 -XX:TargetSurvivorRatio=80
-XX:+CMSClassUnloadingEnabled
```
Note that it works fine if I only run through part of the text file.

Now, I've read a lot about this subject, and it seems that somewhere I must either have a memory leak or be storing too much data in arrays (which I think I do).

My problem is: how can I work around this? Is it possible to change my settings so that I can still perform the computation, or do I really need more computational power?
The really critical VM arg is -Xmx1024m, which tells the VM to use up to 1024 megabytes of memory. The simplest solution is to use a bigger number there. You can try -Xmx2048m or -Xmx4096m, or any number, assuming you have enough RAM in your machine to handle it.

I'm not sure you're getting much benefit out of any of the other VM args. For the most part, if you tell Java how much space to use, it will be smart with the rest of the params. I'd suggest removing everything except the -Xmx param and seeing how that performs.
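For example, launched from a command line it would look like the following (the class and file names are placeholders for your own); in Eclipse, the same flag goes into Run > Run Configurations... > Arguments > VM arguments:

```
java -Xmx2048m -cp bin MyAnalysis data.txt
```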
A better solution is to try to improve your algorithm, but I haven't yet read through it in enough detail to offer any suggestions.
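One generic improvement is worth trying regardless, because the stack trace points at it: every Scanner.nextDouble() call goes through String.replaceAll, which allocates a fresh Matcher per token, and with 1.3 million datapoints that is a lot of short-lived garbage for the collector to chase. Below is a minimal sketch of a lower-garbage streaming reader, written for Java 6 to match your setup; the file name and the running sum are placeholders for your actual computation:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Pattern;

public class StreamingReader {
    // Compile the whitespace pattern once instead of per token.
    private static final Pattern WS = Pattern.compile("\\s+");

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new FileReader("data.txt")); // placeholder name
        try {
            double sum = 0.0; // placeholder aggregate: keep running results
            long count = 0;   // instead of holding all values in arrays
            String line;
            while ((line = in.readLine()) != null) {
                for (String tok : WS.split(line.trim())) {
                    if (tok.length() == 0) {
                        continue; // skip blank lines
                    }
                    sum += Double.parseDouble(tok); // no per-token Matcher allocation
                    count++;
                }
            }
            System.out.println("Read " + count + " values, sum = " + sum);
        } finally {
            in.close();
        }
    }
}
```

If your algorithm allows it, processing each value as it arrives (as sketched above) also avoids keeping the whole dataset in memory at once, which addresses the "too much data in arrays" suspicion directly.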
As you say the data size is really very large, if it does not fit in one computer's memory even after using the -Xmx JVM argument, then you may want to move to cluster computing, using many computers working on your problem. For this you will have to use the Message Passing Interface (MPI).

MPJ Express is a very good implementation of MPI for Java, and in languages like C/C++ there are good existing implementations such as Open MPI and MPICH2. I am not sure whether it will help you in this situation, but it will certainly help you in future projects.
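To give a flavor of what that looks like, here is a minimal MPJ Express sketch; the mpi.* calls follow the mpiJava 1.2 API that MPJ Express implements, and the partitioning comment is only an assumption about how you might split the file:

```java
import mpi.MPI;

public class ParallelScan {
    public static void main(String[] args) throws Exception {
        args = MPI.Init(args);             // start the MPI runtime
        int rank = MPI.COMM_WORLD.Rank();  // id of this process
        int size = MPI.COMM_WORLD.Size();  // total number of processes
        // Hypothetical partitioning: each process reads only every
        // size-th datapoint, so no single JVM holds the whole file.
        System.out.println("process " + rank + " of " + size + " scans its slice");
        MPI.Finalize();                    // shut the runtime down
    }
}
```

You would launch it with the runner that ships with MPJ Express, e.g. `mpjrun.sh -np 4 ParallelScan` (mpjrun.bat on Windows).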