I have a Spring batch program that reads a file (example file I am working with is ~ 4 GB in size), does a small amount of processing on the file, and then writes it off to an Oracle database.
My program uses 1 thread to read the file, and 12 worker threads to do the processing and database pushing.
I am churning lots and lots and lots of young gen memory, which is causing my program to go slower than I think it should.
JDK 1.6.18
Spring batch 2.1.x
4 Core Machine w 16 GB ram
-Xmx12G
-Xms12G
-NewRatio=1
-XX:+UseParallelGC
-XX:+UseParallelOldGC
With these JVM params, I get somewhere around 5.x GB of memory for Tenured Generation, and around 5.X GB of memory for Young Generation.
In the course of processing this one file, my Tenured Generation is fine. It grows to a max of maybe 3 GB, and I never need to do a single full GC.
However, the Young Generation hits it's max many times. It goes up to 5 GB range, and then a parallel minor GC happens and clears Young Gen down to 500MB used. Minor GCs are good and better than a full GC, but it still slows down my program a lot (I am pretty sure the app still freezes when a young gen collection occurs, because I see the database activity die off). I am spending well over 5% of my program time frozen for minor GCs, and this seems excessive. I would say over the course of processing this 4 GB file, I churn through 50-60GB of young gen memory.
I don't see any obvious flaws in my program. I am trying to obey the general OO principles and write clean Java code. I am trying not to create objects for no reason. I am using thread pools, and whenever possible passing objects along instead of creating new objects. I am going to start profiling the application, but I was wondering if anyone had some good general rules of thumb or anti patterns to avoid that lead to excessive memory churn? Is 50-60GB of memory churn to process a 4GB file the best I can do? Do I have to revert to JDk 1.2 tricks like Object Pooling? (although Brian Goetz give a presentation that included why object pooling is stupid, and we don't need to do it anymore. I trust him a lot more than I trust myself .. :) )
I have a feeling that you are spending time and effort trying to optimize something that you should not bother with.
I am spending well over 5% of my program time frozen for minor GCs, and this seems excessive.
Flip that around. You are spending just under 95% of your program time doing useful work. Or put it another way, even if you managed to optimize the GC to run in ZERO time, the best you can get is something over 5% improvement.
If your application has hard timing requirements that are impacted by the pause times, you could consider using a low-pause collector. (Be aware that reducing pause times increases the overall GC overheads ...) However for a batch job, the GC pause times should not be relevant.
What probably matters most is the wall clock time for the overall batch job. And the (roughly) 95% of the time spent doing application specific stuff is where you are likely to get more pay-off for your profiling / targeted optimization efforts. For example, have you looked at batching the updates that you send to the database?
So.. 90% of my total memory is in char[] in "oracle.sql.converter.toOracleStringWithReplacement"
That would tend to indicate that most of your memory usage occurs in the Oracle JDBC drivers while preparing stuff to be sent to the database. There's very little you about that. I'd chalk it up as an unavoidable overhead.
It would be really usefull if you clarify your terms "young" and "tentured" generation because Java 6 has a slightly different GC-Model: Eden, S0+S1, Old, Perm
Have you experimented with the different garbage collection algorithms? How has "UseConcMarkSweepGC" or "UseParNewGC" performed.
And don't forget simply increasing the available space is NOT the solution, because a gc run will take much longer, decrease the size to normal values ;)
Are you sure you have no memory-leaks? In a consumer-producer-pattern - you describe - rarely seldom data should be in the Old Gen because those jobs are proccessed really fast and then "thrown away", or is your work queue filling up?
You should defintely observe your program with a memory analyzer.
I think a session with a memory profiler will shed a lot of light on the subject. This gives a nice overview how many objects are created and this is somtimes revealing.
I am always amazed how many strings are generated.
For domain objects crossreferencing them is also revealing. If you see suddenly 3 times more objects from a derived object than from the source then there something going on there.
Netbeans has a nice one built it. I used JProfiler in the past. I think if you bang long enough on eclipse you can get the same info from the PPTP tools.
You need to profile your application to see what is happening exactly. And I would also try first to use the ergonomics feature of the JVM, as recommended:
2. Ergonomics
A feature referred to here as ergonomics was introduced in J2SE 5.0. The goal of ergonomics is to provide good performance with little or no tuning of command line options by selecting the
- garbage collector,
- heap size,
- and runtime compiler
at JVM startup, instead of using fixed defaults. This selection assumes that the class of the machine on which the application is run is a hint as to the characteristics of the application (i.e., large applications run on large machines). In addition to these selections is a simplified way of tuning garbage collection. With the parallel collector the user can specify goals for a maximum pause time and a desired throughput for an application. This is in contrast to specifying the size of the heap that is needed for good performance. This is intended to particularly improve the performance of large applications that use large heaps. The more general ergonomics is described in the document entitled “Ergonomics in the 5.0 Java Virtual Machine”. It is recommended that the ergonomics as presented in this latter document be tried before using the more detailed controls explained in this document.
Included in this document are the ergonomics features provided as part of the adaptive size policy for the parallel collector. This includes the options to specify goals for the performance of garbage collection and additional options to fine tune that performance.
See the more detailed section about Ergonomics in the Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning guide.
In my opinion, the young generation should not be equally big as the old generation, so that the small garbage collections stay fast.
Do you have many objects that represent the same value? If you do, merge these duplicate objects using a simple HashMap
:
public class MemorySavingUtils {
ConcurrentHashMap<String, String> knownStrings = new ConcurrentHashMap<String, String>();
public String unique(String s) {
return knownStrings.putIfAbsent(s, s);
}
public void clear() {
knownStrings.clear();
}
}
With the Sun Hotspot compiler, the native String.intern()
is really slow for large numbers of Strings, that's why I suggest to build your own String interner.
Using this method, strings from the old generation are reused and strings from the new generation can be garbage collected quickly.
Read a line from a file, store as a string and put in a list. When the list has 1000 of these strings, put it in a queue to be read by worker threads. Have said worker thread make a domain object, peel a bunch of values off the string to set the fields (int, long, java.util.Date, or String), and pass the domain object along to a default spring batch jdbc writer
if that's your program, why not set a smaller memory size, like 256MB?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With