We have several VMs running a data service in production. Clients send RESTful HTTP requests to the service; the load is fairly heavy (about 500 requests per second per host) and is evenly balanced across the VMs. All hosts have the same configuration: 2 CPUs and the JVM options -Xms2048m -Xmx4096m -XX:MaxPermSize=192m -XX:NewSize=512m -XX:MaxNewSize=512M -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+HeapDumpOnOutOfMemoryError.
Two days ago, old gen heap usage started growing on 5 of those VMs (about 300 MB per day), while on the others it stayed flat (around 80 MB). We are trying to identify the root cause. Is this a memory leak, or can it be normal behavior? Does growth in old gen memory usage always mean a memory leak in Java?
Thanks.
Update: We restarted those 5 hosts yesterday, and old gen heap usage on all of them went back to the same level as on the other hosts. However, after this morning's peak load, old gen heap usage on one of them started growing again.
Does growth in old gen memory usage always mean a memory leak in Java?
Not necessarily.
The concurrent mark sweep (CMS) garbage collector does not compact the old gen during its normal concurrent collections. Under sustained memory load the free space can become heavily fragmented, so that even after a collection there may be no contiguous free chunk large enough to promote objects from the young gen into the old gen.
Try turning on these params and see what's going on:
-XX:+PrintGCDetails -XX:+PrintPromotionFailure -XX:PrintFLSStatistics=1
Look for promotion failures and frequent full GC sweeps that fail to free up a lot of memory.
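As a rough way to check for this, the sketch below counts the lines in a GC log that mention a promotion failure or a concurrent mode failure. It assumes the GC output was written to a file (for example via -Xloggc) and that the log uses the usual CMS wording "promotion failed" and "concurrent mode failure"; the class name GcLogScan and the file argument are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class GcLogScan {

    /** Counts the lines that contain the given marker text. */
    static long countLines(List<String> lines, String marker) {
        long count = 0;
        for (String line : lines) {
            if (line.contains(marker)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java GcLogScan <gc-log-file>");
            return;
        }
        // Assumption: args[0] is a GC log produced with the flags above.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        System.out.println("promotion failed:        " + countLines(lines, "promotion failed"));
        System.out.println("concurrent mode failure: " + countLines(lines, "concurrent mode failure"));
    }
}
```

If either count keeps rising on the affected hosts but not on the healthy ones, fragmentation is a more likely suspect than a leak.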
If you're using Java 7 or higher you can try switching to the G1 collector (-XX:+UseG1GC instead of -XX:+UseConcMarkSweepGC). This is a compacting collector which avoids some of the above issues.
If you're still running into problems after that, then I'd look to your code to see if something is hanging onto object references when it shouldn't.
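One way to tell a leak from garbage that simply has not been collected yet is to look at old gen usage right after a collection rather than at the current usage. A minimal sketch using the standard java.lang.management API is below. The pool-name matching ("Old Gen" or "Tenured") is an assumption based on common HotSpot pool names, and OldGenAfterGc is a hypothetical class name; in practice this would run inside the service or be read over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class OldGenAfterGc {

    /** Assumption: old gen pools are named like "CMS Old Gen" or "Tenured Gen". */
    static boolean isOldGenPool(String poolName) {
        return poolName.contains("Old Gen") || poolName.contains("Tenured");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !isOldGenPool(pool.getName())) {
                continue;
            }
            MemoryUsage now = pool.getUsage();
            // Usage measured after the most recent collection of this pool;
            // may be null if the JVM does not support it for this pool.
            MemoryUsage afterGc = pool.getCollectionUsage();
            System.out.println(pool.getName() + " used now: " + now.getUsed() + " bytes");
            if (afterGc != null) {
                System.out.println(pool.getName() + " used after last GC: " + afterGc.getUsed() + " bytes");
            }
        }
    }
}
```

If the after-GC figure climbs steadily across many old gen collections, something is probably holding references; if it drops back to a stable floor, the growth is more likely uncollected garbage.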
Edit: since this is happening on some hosts and not others, I'd lean towards a code issue, perhaps related to unexpected user input, since it only occurs sporadically.
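To illustrate the kind of code issue I mean (this is a guess at a common pattern, not a diagnosis of your service): a long-lived map keyed by request input grows without limit when clients send many distinct values. The sketch below shows a bounded alternative using LinkedHashMap's removeEldestEntry hook. The class name, the key format and the limit of 1000 are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedCacheSketch {

    /** Access-ordered map that evicts the least recently used entry once maxEntries is exceeded. */
    static <K, V> Map<K, V> boundedLru(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        // A plain HashMap here would retain all 100000 entries and end up in the old gen.
        Map<String, byte[]> cache = boundedLru(1000);
        for (int i = 0; i < 100000; i++) {
            cache.put("query-" + i, new byte[1024]);
        }
        System.out.println("entries retained: " + cache.size());
    }
}
```

Note that LinkedHashMap is not thread-safe, so in a request-handling service it would need external synchronization (for example Collections.synchronizedMap) or a dedicated cache library.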