My app is a geo application. Due to a requirement for short response times, each of my instances loads all points into memory and stores them in a structure (a quad tree).
Every minute we load all points (to stay in sync with the DB) and put them into a few quad trees.
We now have 0.5 GB of points. I am trying to prepare for the next level of 5 GB of points. JVM flags: -XX:NewSize=6g -Xms20g -Xmx20g -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCDetails
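For context, a minimal sketch of our reload pattern (the class names below are simplified placeholders, not our real code):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PointRefresher {

    // Simplified placeholders for the real geo classes.
    public interface PointDao { List<double[]> loadAllPoints(); }

    public static class QuadTree {
        private final List<double[]> points = new ArrayList<double[]>();
        public void insert(double[] p) { points.add(p); } // the real tree subdivides
    }

    private volatile QuadTree current; // readers always see a complete tree

    public QuadTree tree() { return current; }

    public void start(final PointDao dao) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                QuadTree fresh = new QuadTree();
                for (double[] p : dao.loadAllPoints()) { // full reload, every minute
                    fresh.insert(p);
                }
                current = fresh; // the entire old tree becomes garbage at once
            }
        }, 0, 1, TimeUnit.MINUTES);
    }
}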
Startup of the instance takes a long time due to GC, and in addition the app suffers from GC pauses all the time.
I would like some references on GC with large heaps.
I can think of a few solutions for this:
Refresh only the changes from the DB instead of reloading everything each time (see the sketch after this list). Cons: the app will still suffer from long GC pauses during its early stage.
An off-heap solution: store the point IDs in the quad tree and keep the points themselves off the heap. Cons: serialization time, and the geo structure is a composite of several objects, not a simple one.
For each instance, create an additional instance that holds the structure, and query against that instance. The geo instance would hold long-lived objects and could have its GC tuned for long-lived objects. Cons: complexity and response time.
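A rough sketch of the first option (the updated_at and deleted columns and the tree methods are illustrative, not our real schema or API):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class DeltaRefresher {

    // Hypothetical mutable tree API; the real quad tree may differ.
    public interface MutableQuadTree {
        void upsert(long id, double lat, double lon);
        void remove(long id);
    }

    private long lastSyncMillis = 0L; // time of the previous successful refresh

    public void refresh(Connection conn, MutableQuadTree tree) throws SQLException {
        long now = System.currentTimeMillis(); // capture before the query runs
        PreparedStatement ps = conn.prepareStatement(
                "SELECT id, lat, lon, deleted FROM points WHERE updated_at > ?");
        try {
            ps.setLong(1, lastSyncMillis);
            ResultSet rs = ps.executeQuery();
            while (rs.next()) {
                if (rs.getBoolean("deleted")) {
                    tree.remove(rs.getLong("id"));          // tombstoned row
                } else {
                    tree.upsert(rs.getLong("id"),
                            rs.getDouble("lat"), rs.getDouble("lon"));
                }
            }
        } finally {
            ps.close(); // closing the statement also closes its result set
        }
        lastSyncMillis = now;
    }
}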
Any references to articles on apps that hold a few GB of long-lived objects would be more than welcome.
We run on Ubuntu (on Amazon). Java 7. There is no memory limitation.
The problem is the long pause time each time we refresh the data.
GC log for the refresh:
2014-06-15T16:32:58.551+0000: 1037.469: [GC2014-06-15T16:32:58.551+0000: 1037.469: [ParNew: 5325855K->259203K(5662336K), 0.0549830 secs] 16711893K->11645244K(20342400K), 0.0551490 secs] [Times: user=0.71 sys=0.00, real=0.05 secs]
2014-06-15T16:33:02.383+0000: 1041.302: [GC2014-06-15T16:33:02.383+0000: 1041.302: [ParNew: 5292419K->470768K(5662336K), 0.0851740 secs] 16678460K->11856811K(20342400K), 0.0853260 secs] [Times: user=1.09 sys=0.00, real=0.09 secs]
2014-06-15T16:33:06.114+0000: 1045.033: [GC2014-06-15T16:33:06.114+0000: 1045.033: [ParNew: 5503984K->629120K(5662336K), 1.5475170 secs] 16890027K->12193877K(20342400K), 1.5476760 secs] [Times: user=5.49 sys=0.61, real=1.55 secs]
2014-06-15T16:33:11.145+0000: 1050.063: [GC2014-06-15T16:33:11.145+0000: 1050.063: [ParNew: 5662336K->558612K(5662336K), 0.7742870 secs] 17227093K->12758866K(20342400K), 0.7744610 secs] [Times: user=3.88 sys=0.82, real=0.77 secs]
2014-06-15T16:33:11.920+0000: 1050.838: [GC [1 CMS-initial-mark: 12200254K(14680064K)] 12761216K(20342400K), 0.1407080 secs] [Times: user=0.13 sys=0.01, real=0.14 secs]
2014-06-15T16:33:12.061+0000: 1050.979: [CMS-concurrent-mark-start]
2014-06-15T16:33:14.208+0000: 1053.127: [CMS-concurrent-mark: 2.148/2.148 secs] [Times: user=19.46 sys=0.44, real=2.15 secs]
2014-06-15T16:33:14.208+0000: 1053.127: [CMS-concurrent-preclean-start]
2014-06-15T16:33:14.232+0000: 1053.150: [CMS-concurrent-preclean: 0.023/0.023 secs] [Times: user=0.14 sys=0.01, real=0.02 secs]
2014-06-15T16:33:14.232+0000: 1053.150: [CMS-concurrent-abortable-preclean-start]
2014-06-15T16:33:15.629+0000: 1054.548: [GC2014-06-15T16:33:15.630+0000: 1054.548: [ParNew: 5591828K->563654K(5662336K), 0.1279360 secs] 17792082K->12763908K(20342400K), 0.1280840 secs] [Times: user=1.65 sys=0.00, real=0.13 secs]
2014-06-15T16:33:19.143+0000: 1058.062: [GC2014-06-15T16:33:19.143+0000: 1058.062: [ParNew: 5596870K->596692K(5662336K), 0.3445070 secs] 17797124K->13077191K(20342400K), 0.3446730 secs] [Times: user=3.06 sys=0.34, real=0.35 secs]
CMS: abort preclean due to time 2014-06-15T16:33:19.832+0000: 1058.750: [CMS-concurrent-abortable-preclean: 5.124/5.600 secs] [Times: user=35.91 sys=1.67, real=5.60 secs]
To reduce GC times, the best thing you can do is use off-heap memory. If you move as much of your large data as possible off the heap, you can reduce your full GC time to as little as 10 milliseconds, even with hundreds of MB of off-heap memory.
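One way to realize this (a sketch under assumptions, not from the original answer): the quad tree holds plain int slots, while the coordinates live in a direct ByteBuffer outside the Java heap, so the collector never traces the point data itself. The class and method names below are made up for the example:

import java.nio.ByteBuffer;

public class OffHeapPointStore {

    private static final int RECORD_BYTES = 16; // lat + lon, one double each

    // A direct buffer lives outside the Java heap, so the GC never scans it.
    // A single direct buffer is limited to ~2 GB, so a real store for 5 GB
    // of points would shard across several buffers.
    private final ByteBuffer buf;

    public OffHeapPointStore(int maxPoints) {
        buf = ByteBuffer.allocateDirect(maxPoints * RECORD_BYTES);
    }

    // slot must be a dense index in [0, maxPoints); mapping point ids to
    // slots is left to the caller in this sketch.
    public void put(int slot, double lat, double lon) {
        buf.putDouble(slot * RECORD_BYTES, lat);
        buf.putDouble(slot * RECORD_BYTES + 8, lon);
    }

    public double lat(int slot) { return buf.getDouble(slot * RECORD_BYTES); }
    public double lon(int slot) { return buf.getDouble(slot * RECORD_BYTES + 8); }
}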
High heap usage occurs when the garbage collection process cannot keep up. An indicator of high heap usage is when garbage collection is incapable of reducing heap usage to around 30%. A healthy JVM heap chart shows the normal sawtooth pattern: usage climbs between collections and drops sharply at each one.
People have made some very good comments on the design points above that you should definitely consider. If you consider GC implications only, your question is a little tricky to answer, as garbage collection tuning is not an exact science. What you may wish to consider is the actual GC algorithm you are using. But a little background first:
Garbage collection is generational, and its performance relies on a high infant mortality rate among objects; that is, objects are created and rapidly destroyed during the application flow. You are holding onto large objects (due to the caching nature of the app), which means you will keep filling up Eden, and GC will be required to promote those objects to the survivor and tenured spaces.
This means that should you ever need to reclaim this memory, it is likely to be in tenured and will therefore require a longer GC pause to collect. However, given the size of your heap this might not be a problem, as it may never need to do a full GC. A healthy application is expected to GC frequently for a very short time to collect Eden, so don't get too distracted by the garbage collector kicking in; do, however, get concerned about a full GC. If you haven't got your garbage collector logs turned on, now is the time to do so, as this will allow you to measure rather than assume:
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintTenuringDistribution
Also see this answer for more details.
From Java 7 update 4 you can use the G1 garbage collector. This algorithm is specifically designed to work with large heaps on multicore machines. When you start your application you would use something like this command:
java -Xmx50m -Xms50m -XX:+UseG1GC -XX:MaxGCPauseMillis=200
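For the 20 GB heap in the question, that might look something like the following (the pause target is just a starting point, and the jar name is a placeholder):
java -Xms20g -Xmx20g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -jar geo-app.jar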
This allows you to specify your heap size and a pause-time target you are looking to achieve. G1 works slightly differently from the default mark-and-sweep algorithm, in that it divides the heap into regions and tries to reclaim regions that contain mostly garbage. Fully reclaimable regions are cheap to collect, so it finds as many of these as possible and collects them within your desired pause time. It does as much of this as possible in parallel, leveraging your multiple cores to avoid wasting time during stop-the-world pauses.
It is certainly not guaranteed that G1 will be the silver bullet for your problem, but with your usage pattern you might find it works really well (as things that are no longer needed in the cache can be removed more easily and are likely to be located close together due to the principle of locality). There is some beginner information on G1 here. Also take a look at this article, which goes into a lot more depth than I have here.
I know this is an old question, but I thought it would be worth adding that if your application uses a lot of caching, you can benefit from a larger young generation. For some reason this is rarely recommended, but I've had some really good results with it in a caching use case. The best place for objects to die is the young generation; there is essentially no cost to collecting objects there. The worst thing that can happen is that objects live just long enough to be promoted to the tenured space and then promptly die.
There are two ways around this in HotSpot.
If you have a lot of memory to play with, I would try allocating a very large amount to the young generation. Try giving the young generation at least twice what you expect to be live in the cache at any time, and leave the tenuring threshold at 2 at first.
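As a concrete starting point for the 20 GB heap in the question (the generation sizes are illustrative assumptions, not tuned values, and the jar name is a placeholder), that could look like:
java -Xms20g -Xmx20g -Xmn12g -XX:MaxTenuringThreshold=2 -XX:+UseConcMarkSweepGC -jar geo-app.jar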
Using a setup like this, we've been running a cache server (Infinispan) for over a month with no major collections. Minor collections run about every hour and max out at a 0.2-second pause to collect 6-7 GB of garbage. The live set is approximately 1 GB at any given time, and it is this live set that primarily determines the pause time, so increasing the young generation further will make collections less frequent but should not change the pause times much.
In an application that does other things besides caching, this might not work perfectly. In my case the tenured space holds steady at around 120 MB. Ideally, you'll have enough space for your long-lived objects and only have to move things to tenured once. Depending on the application, this might be difficult to pull off.