I'm running Tomcat 7.0.53 on 64-bit CentOS 6.5 with 64-bit OpenJDK 1.7.
I have several servers where, roughly once a week and seemingly at random, Tomcat will just gracefully shut itself down. I've been looking into this for months and cannot find the reason. The only pattern seems to be a long period of nothing going on, then the shutdown.
Here is the memory usage of one of the servers. Lots of RAM free.
                   total       used       free     shared    buffers     cached
Mem:                2006        771       1234          0        176        281
-/+ buffers/cache:               313       1692
Swap:               2047          0       2047
Below is one of the events in catalina.out. You can see it starts, then nothing exciting happens for several hours, and then it gracefully shuts itself down as if it had been told to.
I've researched this to death and have not been able to get a handle on it.
Can someone please propose a plan of action for me?
Thanks
From the Server.xml:
<Server port="-1" shutdown="__SHUTDOWN__">
From Catalina.out:
Apr 28, 2014 5:34:50 PM org.apache.tomcat.util.digester.SetPropertiesRule begin
WARNING: [SetPropertiesRule]{Server/Service/Engine/Host/Valve} Setting property 'remoteIpProxiesHeader' to 'x-forwarded-by' did not find a matching property.
Apr 28, 2014 5:34:50 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 5:34:50 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["ajp-apr-8009"]
Apr 28, 2014 5:34:50 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 3247 ms
Apr 28, 2014 5:34:50 PM org.apache.catalina.core.StandardService startInternal
INFO: Starting service Catalina
Apr 28, 2014 5:34:50 PM org.apache.catalina.core.StandardEngine startInternal
INFO: Starting Servlet Engine: Apache Tomcat/7.0.53
Apr 28, 2014 5:34:59 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 5:34:59 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["ajp-apr-8009"]
Apr 28, 2014 5:34:59 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 8278 ms
Apr 28, 2014 5:41:53 PM org.apache.jasper.compiler.TldLocationsCache tldScanJar
INFO: At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["ajp-apr-8009"]
Apr 28, 2014 10:32:32 PM org.apache.catalina.core.StandardService stopInternal
INFO: Stopping service Catalina
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["ajp-apr-8009"]
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["ajp-apr-8009"]
May 05, 2014 8:10:32 PM org.apache.catalina.core.AprLifecycleListener init
INFO: Loaded APR based Apache Tomcat Native library 1.1.29 using APR version 1.3.9.
May 05, 2014 8:10:32 PM org.apache.catalina.core.AprLifecycleListener init
INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
May 05, 2014 8:10:32 PM org.apache.tomcat.util.digester.SetPropertiesRule begin
Something is telling Tomcat to shut down.
Tomcat does not gracefully shut down when it's "in trouble" (out of memory, or anything else). Linux does not gracefully shut down processes when it's low on memory either; it stops them hard with a kill -9, which leaves no trace in the Tomcat log.
kill -15 will shut Tomcat down gracefully, via the shutdown hook. (An earlier version of this answer said kill -3, which simply triggers a thread dump.)
An internal exit() will shut it down gracefully as well, again via the shutdown hook.
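To see the difference between those signals concretely, here is a small stand-alone sketch (plain Java, nothing Tomcat-specific) that registers a shutdown hook the same way Catalina does:

public class ShutdownHookDemo {

    public static void main(String[] args) throws Exception {
        // The hook runs on kill -15 (SIGTERM) and on System.exit(),
        // but NOT on kill -9. kill -3 only prints a thread dump and
        // leaves the process running.
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                System.out.println("Shutdown hook fired - graceful path taken");
            }
        });

        System.out.println("Sleeping; try kill -3, -15 and -9 against this PID");
        Thread.sleep(Long.MAX_VALUE);
    }
}

Send it kill -3, kill -15 and kill -9 in turn: only kill -15 (and System.exit()) runs the hook.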
It can also be shut down via JMX and, of course, via the shutdown command sent to Catalina's shutdown port.
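For completeness, that shutdown command is nothing more than a string written to the shutdown port. A rough sketch of what catalina.sh stop boils down to, assuming the default port 8005 and command SHUTDOWN; note that with port="-1" as in the server.xml above, Tomcat never listens for this command at all, so that particular path can be ruled out here:

import java.io.OutputStream;
import java.net.Socket;

public class SendShutdown {
    public static void main(String[] args) throws Exception {
        // Connect to the shutdown port and write the shutdown command
        // configured in server.xml.
        Socket socket = new Socket("localhost", 8005);
        try {
            OutputStream out = socket.getOutputStream();
            out.write("SHUTDOWN".getBytes("UTF-8"));
            out.flush();
        } finally {
            socket.close();
        }
    }
}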
So, someone, somewhere, is telling Tomcat to do this. This isn't some unknown, internal "problem". Those kinds of things just kill Tomcat outright; they don't ask politely.
One thing you might try is enabling DEBUG logging to see what chatter you get. If that gives you too much, you can try looking specifically at org.apache.catalina.core.StandardContext. This may well not give you more than what is already being logged, but it may tell you something.
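For example, a minimal sketch of the change in conf/logging.properties (JULI is built on java.util.logging, where FINE is the closest equivalent of DEBUG); the handler lines are only needed if your handlers are currently set above FINE:

# turn up context lifecycle logging
org.apache.catalina.core.StandardContext.level = FINE

# handlers must also pass FINE through for it to reach the log files
1catalina.org.apache.juli.FileHandler.level = FINE
java.util.logging.ConsoleHandler.level = FINE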
After that, if you're so motivated and it keeps happening, I'd suggest building Tomcat yourself and interjecting some stack trace dumps.
For example, in org.apache.coyote.AbstractProtocol, you can add the following to the stop() method:
// new Exception(...) captures the current call stack without being thrown
Exception e = new Exception("Shut down trace");
getLog().info("Shutting down protocol", e);
That will give you a solid stack trace showing where the call came from. Tomcat isn't really a nasty event storm of random messages floating around; the shutdown is pretty synchronous. So that stack trace will give you a solid lead as to who started the party, and from there you can work out how it could have happened.
There are a bunch of interfaces and abstractions in Tomcat, but really only a few implementations, so it's not impenetrable. And having a smoking stack trace will help immensely in pinning it down.
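If rebuilding Tomcat isn't appealing, a lighter-weight alternative is a custom LifecycleListener that logs a stack trace the moment the Server is asked to stop. A rough sketch (the class and package name are just placeholders; compile it against catalina.jar and tomcat-juli.jar and drop the jar into $CATALINA_HOME/lib):

package com.example.diag;

import org.apache.catalina.Lifecycle;
import org.apache.catalina.LifecycleEvent;
import org.apache.catalina.LifecycleListener;
import org.apache.juli.logging.Log;
import org.apache.juli.logging.LogFactory;

// Logs a stack trace whenever the Server component is about to stop,
// so catalina.out records who initiated the shutdown.
public class ShutdownTraceListener implements LifecycleListener {

    private static final Log log = LogFactory.getLog(ShutdownTraceListener.class);

    @Override
    public void lifecycleEvent(LifecycleEvent event) {
        if (Lifecycle.BEFORE_STOP_EVENT.equals(event.getType())) {
            log.info("Server stop requested", new Exception("Shutdown trace"));
        }
    }
}

Register it near the top of server.xml, directly under <Server>, with <Listener className="com.example.diag.ShutdownTraceListener"/>; the next time the shutdown happens, the logged trace shows which thread called stop and through what call chain.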
A case in DC/OS
Recently we had a similar incident. Our Tomcat runs in a Docker container as a service in a DC/OS environment. The stderr just said Tomcat was killed, and there were no useful traces in any of Tomcat's log files.
It turned out the reason was simply that we had not allocated enough memory to the service. There are more debugging suggestions here, but the key technique we used was to check the kernel log:
# journalctl -f _TRANSPORT=kernel
...
Mar 29 15:09:09 host-17 kernel: Memory cgroup out of memory: Kill process 7935 (java) score 1021 or sacrifice child
Mar 29 15:09:09 host-17 kernel: Killed process 7906 (java) total-vm:11583840kB, anon-rss:1032860kB, file-rss:30924kB, shmem-rss:0kB
...
Once we found that out, we increased the memory allocation to the service in DC/OS and the problem was solved.