Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tomcat8 shutdown randomly with AbstractProtocol.pause

1. the problem

I deployed a webapp in Tomcat, and I found that it shutdowns randomly, the time varies from 2 or 3 hours to 2 or 3 days. The log in catalina.out is:

26224 2015-06-10 13:59:04.110 {http-nio-8080-exec-3} INFO  com.timediff.controller.user.UserProfileController#getUserHome - /user/profile/home done, curUid: 889
26225 10-Jun-2015 14:15:35.050 INFO [Thread-11] org.apache.coyote.AbstractProtocol.pause Pausing ProtocolHandler ["http-nio-8080"]
26226 10-Jun-2015 14:15:35.052 INFO [Thread-11] org.apache.coyote.AbstractProtocol.pause Pausing ProtocolHandler ["ajp-nio-8009"]
26227 10-Jun-2015 14:15:35.053 INFO [Thread-11] org.apache.catalina.core.StandardService.stopInternal Stopping service Catalina
26228 10-Jun-2015 14:15:35.058 INFO [localhost-startStop-2] org.springframework.web.context.support.XmlWebApplicationContext.doClose Closing WebApplicationContext for namespace 'timediff-dispatcher-servlet': startup date [Wed Jun 10 13:38:14 CST 2015]; root of context hierarchy
26229 10-Jun-2015 14:15:35.059 INFO [localhost-startStop-2] org.springframework.context.support.DefaultLifecycleProcessor.stop Stopping beans in phase 2147483647
26230 2015-06-10 14:15:35.061 {localhost-startStop-2} INFO  org.quartz.core.QuartzScheduler#standby - Scheduler TimediffScheduler_$_iZu1skaofy1Z1433914696931 paused.
26231 10-Jun-2015 14:15:35.072 INFO [localhost-startStop-2] org.springframework.scheduling.quartz.SchedulerFactoryBean.destroy Shutting down Quartz Scheduler
26232 2015-06-10 14:15:35.072 {localhost-startStop-2} INFO  org.quartz.core.QuartzScheduler#shutdown - Scheduler TimediffScheduler_$_iZu1skaofy1Z1433914696931 shutting down.
26233 2015-06-10 14:15:35.075 {localhost-startStop-2} INFO  org.quartz.core.QuartzScheduler#standby - Scheduler TimediffScheduler_$_iZu1skaofy1Z1433914696931 paused.
26234 2015-06-10 14:15:35.077 {localhost-startStop-2} INFO  org.quartz.core.QuartzScheduler#shutdown - Scheduler TimediffScheduler_$_iZu1skaofy1Z1433914696931 shutdown complete.
26235 10-Jun-2015 14:15:35.082 INFO [localhost-startStop-2] org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.shutdown Shutting down ExecutorService 'quartzThreadPool'
26236 2015-06-10 14:15:35.103 {localhost-startStop-2} INFO  com.timediff.listener.StopMemoryLeakListener#lambda$contextDestroyed$0 - driver: com.mysql.jdbc.Driver@7657b26d is de-registered.
26237 2015-06-10 14:15:35.104 {localhost-startStop-2} INFO  com.timediff.listener.StopMemoryLeakListener#contextDestroyed - AbandonedConnectionCleanupThread shutdown.
26238 10-Jun-2015 14:15:35.150 INFO [Thread-11] org.apache.coyote.AbstractProtocol.stop Stopping ProtocolHandler ["http-nio-8080"]
26239 10-Jun-2015 14:15:35.152 INFO [Thread-11] org.apache.coyote.AbstractProtocol.stop Stopping ProtocolHandler ["ajp-nio-8009"]
26240 10-Jun-2015 14:15:35.154 INFO [Thread-11] org.apache.coyote.AbstractProtocol.destroy Destroying ProtocolHandler ["http-nio-8080"]
26241 10-Jun-2015 14:15:35.156 INFO [Thread-11] org.apache.coyote.AbstractProtocol.destroy Destroying ProtocolHandler ["ajp-nio-8009"]

on stackoverflow, this question and this question is very similar to my situation, but I'm still stumbled.

Now I'll give a detailed description of my problem:

2. more detail

2.1 tomcat and jdk version

Tomcat: 8.0.22
JDK: 1.8.0_45

2.2 jvm options in catalina.sh:

CATALINA_OPTS="-server -Xms1g -Xmx1g -XX:MaxMetaspaceSize=512m -Xmn512m 
-XX:SurvivorRatio=8 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled 
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70 -XX:+ScavengeBeforeFullGC 
-XX:+CMSScavengeBeforeRemark
-XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails 
-Xloggc:/opt/logs/gc/timediff-gc.log 
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M
-Dsun.net.inetaddr.ttl=120  -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/opt/logs/gc/timediff-oom.hprof  
-Djava.rmi.server.hostname=**.**.**.**
-Dcom.sun.management.jmxremote.port=1099 
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false"

2.3 there are no exception logs in my webapp related to the abortion of tomcat, and I'm sure I never called System.exit(), and there are no code blocks like:

try {

} catch(Exception e) {
    // do nothing
}

2.4 while I actually found Allocation Failure in gc log:

2015-06-10T15:36:28.589+0800: 3099.795: [GC (Allocation Failure) 3099.795: [ParNew: 419780K->382K(471872K), 0.0125816 secs] 469721K->50348K(996160K), 0.0126820 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-06-10T15:37:30.141+0800: 3161.347: [GC (Allocation Failure) 3161.347: [ParNew: 419838K->372K(471872K), 0.0062445 secs] 469804K->50338K(996160K), 0.0063629 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-06-10T15:38:41.680+0800: 3232.886: [GC (Allocation Failure) 3232.886: [ParNew: 419828K->369K(471872K), 0.0064920 secs] 469794K->50356K(996160K), 0.0066009 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-06-10T15:39:43.222+0800: 3294.428: [GC (Allocation Failure) 3294.428: [ParNew: 419825K->384K(471872K), 0.0058772 secs] 469812K->50372K(996160K), 0.0059823 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-06-10T15:40:54.758+0800: 3365.964: [GC (Allocation Failure) 3365.964: [ParNew: 419840K->388K(471872K), 0.0056674 secs] 469828K->50395K(996160K), 0.0069850 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]

I think that maybe this is the cause, but the result of TOP and jvisualVM makes it unclear:

enter image description hereenter image description here

web@iZu1skaofy1Z:/usr/local/apache-tomcat-8.0.22/logs$ free -m
             total       used       free     shared    buffers     cached
Mem:          3951       3087        864          0        190        553
-/+ buffers/cache:       2343       1608
Swap:            0          0          0

top - 15:50:05 up 16 days,  5:11,  2 users,  load average: 0.33, 0.17, 0.09
Tasks: 128 total,   2 running, 126 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us,  0.5 sy,  0.0 ni, 98.5 id,  0.0 wa,  0.2 hi,  0.0 si,  0.0 st
KiB Mem:   4046820 total,  3161260 used,   885560 free,   194880 buffers
KiB Swap:        0 total,        0 used,        0 free.   566984 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
27307 web       20   0 2068604 865872  22048 S   0.7 21.4  20:20.28 java
16557 web       20   0 3680756 801708  13740 S   0.0 19.8   2:02.99 java
15597 mysql     20   0 1800972 526220   6636 S   0.0 13.0  36:26.08 mysqld

2.4 I deployed another tomcat on the same server, but I changed the shutdown port and connector port, I don't think that they are conflict.

I've tried my best, maybe I forget something during the analysis, please help to give me some tips, thanks in advance!

update(2015-07-04): after I switch from user web to user root when running tomcat, the problem never occurs. So I doubt that tomcat is kill by system because of user privilege, if you have any idea, please tell me, thanks!

like image 731
Michael Avatar asked Jun 10 '15 14:06

Michael


1 Answers

This answer (from one of the questions you found) seems good.

Something is telling Tomcat to stop. And since it doesn't happen when Tomcat is run as root, I think the cause is some other (non-system) process (maybe a script or a cron job) sending a signal (probably, SIGTERM) to Tomcat, like kill <tomcat pid>. Maybe that other process is also run as user web - that would explain why that process can't kill root's Tomcat. Or maybe that other process just searches for processes to kill, and one of the criteria is "processes owned by web".

I suggest you carefully read crontabs for users root and web, system-wide crontab and everything in /etc/cron.*/ folders. You can also check if any other processes owned by web are suddenly terminated. And building Tomcat from source, with some tracing added (as suggested in the answer I mentioned), seems a good idea.

like image 122
Roman Avatar answered Sep 19 '22 23:09

Roman