We recently started using New Relic to monitor our production webapp, which is hosted on a Tomcat 7.0.6 server. Since then, the memory footprint of the Tomcat process has been increasing continuously. Within a week it consumes all the memory on the server (an AWS High-Memory Double Extra Large instance) and becomes unresponsive; the only way to recover is to restart it. We pass the Xms and Xmx arguments when starting Tomcat, but within a few hours the memory usage of the Tomcat process exceeds the Xmx value and keeps growing until the server runs out of memory. Here is the process command:
/usr/java/jdk1.6.0_24//bin/java
-Djava.util.logging.config.file=/xxx/xxx/xxx/xxx/apache-tomcat-7.0.6/conf/logging.properties
-Xms8192m
-Xmx8192m
-javaagent:/xxx/xxx/xxx/xxx/apache-tomcat-7.0.6/newrelic/newrelic.jar
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
-Duser.timezone=Asia/Calcutta
-Djava.endorsed.dirs=/xxx/xxx/xxx/xxx/apache-tomcat-7.0.6/endorsed
-classpath /xxx/xxx/xxx/xxx/apache-tomcat-7.0.6/bin/bootstrap.jar:/xxx/xxx/xxx/xxx/apache-tomcat-7.0.6/bin/tomcat-juli.jar
-Dcatalina.base=/xxx/xxx/xxx/xxx/apache-tomcat-7.0.6
-Dcatalina.home=/xxx/xxx/xxx/xxx/apache-tomcat-7.0.6
-Djava.io.tmpdir=/xxx/xxx/xxx/xxx/apache-tomcat-7.0.6/temp org.apache.catalina.startup.Bootstrap start
I would expect this process to use little more than 8 GB of memory, but within hours it goes above 10 GB, and within a few days it goes above 20 GB, at which point everything else on the server suffers (I use 'top' to check memory usage). How is this possible?
There is a known issue that can affect any Sun/Oracle JVM and shows up as unbounded growth in non-heap (native) memory. New Relic Java agent versions 2.16 and later include a workaround: add a shutdown delay for class transformation to the common section of your newrelic.yml file:
  class_transformer:
    shutdown_delay: 3600
From the changelog:
Work-around for Oracle JVM bug that in rare cases causes a native memory leak
In rare cases, the Oracle JVM can leak native OS memory (not heap space) when classes are intercepted by the agent. This setting turns off interception of classes that are loaded after the given number of seconds. The agent will continue to monitor classes loaded before this time.
Here is some more information on the incident reported above. The memory leak is not in the Java heap: the application never hits an OutOfMemoryError (8 GB is the maximum Java heap size we have set). However, the virtual and resident memory of the process keep increasing until the server runs out of RAM. We have confirmed that the leak occurs only when the New Relic agent is in use. Version: New Relic Agent v2.1.2
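One way to check that the growth is outside the Java heap is to compare what the JVM reports about itself with the resident set size reported by the OS. The following is only a minimal sketch and was not part of the original report. The class name MemoryReport and its helper methods are hypothetical, it reads /proc/self/status and so assumes Linux, and it measures the JVM it runs in, so to inspect Tomcat the same calls would have to run inside the webapp (or the MXBean values be read remotely over JMX).

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, written to compile on Java 6 and later.
public class MemoryReport {

    // Extracts the VmRSS value (in kB) from the lines of /proc/<pid>/status.
    // Returns -1 if no VmRSS line is present.
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // The line looks like "VmRSS:\t  123456 kB".
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1L;
    }

    // Reads /proc/self/status (assumes Linux); returns an empty list if unavailable.
    static List<String> readProcStatus() {
        List<String> lines = new ArrayList<String>();
        BufferedReader in = null;
        try {
            in = new BufferedReader(new FileReader("/proc/self/status"));
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // /proc is not available: return whatever was read so far.
        } finally {
            if (in != null) {
                try { in.close(); } catch (IOException ignored) { }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();
        MemoryUsage nonHeap = mem.getNonHeapMemoryUsage();
        long mb = 1024L * 1024L;

        System.out.println("heap used      (MB): " + heap.getUsed() / mb);
        System.out.println("heap committed (MB): " + heap.getCommitted() / mb);
        System.out.println("non-heap used  (MB): " + nonHeap.getUsed() / mb);

        long rssKb = parseVmRssKb(readProcStatus());
        System.out.println("process RSS    (MB): "
                + (rssKb < 0 ? "n/a" : String.valueOf(rssKb / 1024)));
    }
}
```

If the heap figures stay below the Xmx value while the process RSS keeps climbing well beyond heap plus non-heap, the growth is in native memory that the JVM's own memory pools do not account for.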
Sorry for the trouble. We (New Relic) are investigating the problem, but our first suggestion is to try the latest version of the Java agent, 2.2.1, which substantially changed the way we instrument classes.
I will follow up here when we have more information.