I have the following situation: there are a couple of machines forming a cluster. Clients can load data-sets, and we need to select the node on which a dataset will be loaded, and refuse to load it / avoid an OOM error if there is no single machine that could fit the dataset.
What we do currently: we know the entry count in the dataset and estimate the memory to be used as entry count * empirical factor (determined manually). We then check whether this is lower than free memory (obtained via Runtime.freeMemory()) and, if so, load it (otherwise redo the process on other nodes / report that there is no free capacity).
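To make that concrete, here is a minimal sketch of the check described above; entryCount, empiricalBytesPerEntry and loadDataset() are placeholders, not names from our actual code:

// Rough estimate of the memory the dataset will need (the factor is maintained by hand).
long estimatedBytes = entryCount * empiricalBytesPerEntry;
// Free memory within the current heap; may underreport until a GC has run.
long free = Runtime.getRuntime().freeMemory();
if (estimatedBytes < free) {
    loadDataset();   // placeholder for the actual load
} else {
    // try the next node, or report that there is no free capacity
}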
The problems with this approach are:
- the empirical factor needs to be revisited and updated manually
- freeMemory sometimes may underreport because of some non-cleaned-up garbage (which could be avoided by running System.gc before each such call; however, that would slow down the server and also potentially lead to premature promotion)

Are there better solutions to this problem?
You can use Runtime.getRuntime().totalMemory() to get the total memory from the JVM; it represents the current heap size of the JVM, which is a combination of the memory currently occupied by objects and the free memory available for new objects.
The JVM has a default maximum heap of 1/4 of main memory. If you have 4 GB it will default to 1 GB. Note: this is a pretty small system, and you can get embedded devices and phones with this much memory. If you can afford to buy a little more memory, it will make your life easier.
Use this code:

// Get current size of heap in bytes
long heapSize = Runtime.getRuntime().totalMemory();
// Get maximum size of heap in bytes; the heap cannot grow beyond this size.
long heapMaxSize = Runtime.getRuntime().maxMemory();
The empirical factor can be calculated as a build step and placed in a properties file.

While freeMemory() is almost always less than the amount which would be free after a GC, you can check it to see whether the memory is available, and call System.gc() if maxMemory() indicates there might be plenty.
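A minimal sketch of that check, assuming requiredBytes has already been estimated from the entry count and the empirical factor (the name is a placeholder):

Runtime rt = Runtime.getRuntime();
long used = rt.totalMemory() - rt.freeMemory();
// Only bother with an explicit GC if the max heap suggests the dataset could fit at all.
if (rt.freeMemory() < requiredBytes && rt.maxMemory() - used >= requiredBytes) {
    System.gc();   // see the note below about using this in production
    used = rt.totalMemory() - rt.freeMemory();   // re-sample after the collection
}
// Compare against what could still be made available up to the max heap size.
boolean fits = rt.maxMemory() - used >= requiredBytes;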
NOTE: Using System.gc() in production only makes sense in very rare situations, and in general it is often used incorrectly, resulting in a reduction in performance and obscuring the real problem.
I would avoid triggering an OOME unless you are running in a JVM you can restart as required.
My solution:

Set Xmx to 90%-95% of the RAM of the physical machine if no other process is running except your program. For a 32 GB RAM machine, set Xmx to 27 GB - 28 GB.
Use one of the good GC algorithms - CMS or G1GC - and fine-tune the relevant parameters. I prefer G1GC if you need more than 4 GB RAM for your application. Refer to these questions if you choose G1GC:
Agressive garbage collector strategy
Reducing JVM pause time > 1 second using UseConcMarkSweepGC
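Putting the two points above together, a launch line could look roughly like the following; the exact pause target and occupancy threshold depend on your workload, and the jar name is just a placeholder:

java -Xmx28g -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=45 -jar dataset-loader.jar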
Calculate a cap on memory usage yourself instead of checking free memory. Add the used memory and the memory to be allocated, and subtract the result from your own cap (e.g. 90% of Xmx). If the result shows you still have available memory, grant the memory allocation request.
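A minimal sketch of that admission check, assuming the memory to be allocated has already been estimated (requestedBytes is a placeholder for that estimate):

long maxHeap = Runtime.getRuntime().maxMemory();   // roughly the -Xmx value
long cap = (long) (maxHeap * 0.9);                  // your own cap, e.g. 90% of Xmx
long used = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
// Grant the request only if used memory plus the new allocation stays under the cap.
boolean grant = used + requestedBytes <= cap;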
An alternative approach is to isolate each data-load in its own JVM. You just predefine each JVM's max-heap-size and so on, and set the number of JVMs per host in such a way that each JVM can take up its full max-heap-size. This will use a bit more resources — it means you can't make use of every last byte of memory by cramming in more low-memory data-loads — but it massively simplifies the problem (and reduces the risk of getting it wrong), it makes it feasible to tell when/whether you need to add new hosts, and most importantly, it reduces the impact that any one client can have on all other clients.
With this approach, a given JVM is either "busy" or "available".
After any given data-load completes, the relevant JVM can either declare itself available for a new data-load, or it can just close. (Either way, you'll want to have a separate process to monitor the JVMs and make sure that the right number are always running.)
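A rough sketch of what launching such an isolated, fixed-size worker JVM could look like; the jar, main class and dataset argument are all hypothetical names, and -XX:+ExitOnOutOfMemoryError requires a reasonably recent HotSpot JVM:

// Each data-load runs in its own JVM with a predefined max heap.
ProcessBuilder pb = new ProcessBuilder(
        "java", "-Xmx4g", "-XX:+ExitOnOutOfMemoryError",
        "-cp", "loader.jar", "com.example.DataLoadWorker", datasetId);
pb.inheritIO();
Process worker = pb.start();
// waitFor() throws InterruptedException; handle it where this code actually lives.
// A non-zero exit code (including an OOM-triggered exit) only takes down this worker,
// not the process coordinating the cluster.
int exitCode = worker.waitFor();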
An alternative would be to "just try to load the dataset" (and back out if an OOM is thrown); however, once an OOM is thrown, you have potentially corrupted other threads running in the same JVM, and there is no graceful way of recovering from it.
There are no good ways to handle and recover from an OOME in the JVM, but there is a way to react before the OOM happens. Java has java.lang.ref.SoftReference, which is guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError. This fact can be used for early prediction of OOM. For example, the data load can be aborted if the prediction triggers.
ReferenceQueue<Object> q = new ReferenceQueue<>();
SoftReference<Object> reference = new SoftReference<>(new Object(), q);
// remove() blocks until the soft reference is cleared under memory pressure and enqueued;
// it declares InterruptedException, so run this in a dedicated watcher thread.
q.remove();
// reference cleared - stop the data load immediately
Sensitivity can be tuned with the -XX:SoftRefLRUPolicyMSPerMB flag (for the Oracle JVM). The solution is not ideal; its effectiveness depends on various factors - whether other soft references are used in the code, how the GC is tuned, the JVM version, the weather on Mars... But it can help if you are lucky.