First things first, my system info and versions:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 13.04
Release: 13.04
Codename: raring
$ sudo docker version
Client version: 0.9.0
Go version (client): go1.2.1
Git commit (client): 2b3fdf2
Server version: 0.9.0
Git commit (server): 2b3fdf2
Go version (server): go1.2.1
$ lxc-version
lxc version: 0.9.0
$ uname -a
Linux ip-10-0-2-86 3.8.0-19-generic #29-Ubuntu SMP Wed Apr 17 18:16:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
I am not able to stop a container after the process inside of it becomes a zombie. After upgrading to Docker 0.9.0 I started seeing tons of zombies on my server. Example:
$ ps axo stat,ppid,pid,comm | grep -w defunct
Zl 25327 25332 node <defunct>
$ pstree -p
init(1)─┬
├─sh(819)───docker(831)─┬
├─lxc-start(25327)───node(25332)───{node}(25378)
I can see that lxc-start(25327) is not calling wait() on the node process (25332), which is keeping the zombie alive. So I checked what it was doing with strace, and it seemed to be stuck on an epoll_wait. strace actually gets stuck at first and just shows this:
$ sudo strace -ir -ttt -T -v -p 25327
Process 25327 attached - interrupt to quit (when asked to kill)
0.000103 [ 7fe59b9d34b3] epoll_wait(8,
but after I run a sudo docker kill 3da5764b7bc9358 I get more output:
0.000103 [ 7fe59b9d34b3] epoll_wait(8, {{EPOLLIN, {u32=21673408, u64=21673408}}}, 10, 4294967295) = 1 <8.935002>
8.935097 [ 7fe59bcaff60] accept(4, 0, NULL) = 9 <0.000035>
0.000095 [ 7fe59bcafeb3] fcntl(9, F_SETFD, FD_CLOEXEC) = 0 <0.000027>
0.000083 [ 7fe59b9d401a] setsockopt(9, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0 <0.000027>
0.000089 [ 7fe59b9d347a] epoll_ctl(8, EPOLL_CTL_ADD, 9, {EPOLLIN, {u32=21673472, u64=21673472}}) = 0 <0.000023>
0.000087 [ 7fe59b9d34b3] epoll_wait(8, {{EPOLLIN, {u32=21673472, u64=21673472}}}, 10, 4294967295) = 1 <0.000026>
0.000090 [ 7fe59bcb0130] recvmsg(9, {msg_name(0)=NULL, msg_iov(1)=[{"\3\0\0\0\0\0\0\0", 8}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=773, uid=0, gid=0}}, msg_flags=0}, 0) = 8 <0.000034>
0.000128 [ 7fe59bcb019d] sendto(9, "\0\0\0\0\0\0\0\0\364b\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 24, 0, NULL, 0) = 24 <0.000029>
0.000090 [ 7fe59b9d34b3] epoll_wait(8, {{EPOLLIN|EPOLLHUP, {u32=21673472, u64=21673472}}}, 10, 4294967295) = 1 <0.000018>
0.000091 [ 7fe59bcb0130] recvmsg(9, {msg_name(0)=NULL, msg_iov(1)=[{"\3\0\0\0\0\0\0\0", 8}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=0, uid=0, gid=0}}, msg_flags=0}, 0) = 0 <0.000026>
0.000122 [ 7fe59b9d347a] epoll_ctl(8, EPOLL_CTL_DEL, 9, NULL) = 0 <0.000037>
0.000084 [ 7fe59bcafd00] close(9) = 0 <0.000048>
0.000103 [ 7fe59b9d34b3] epoll_wait(8, {{EPOLLIN, {u32=21673408, u64=21673408}}}, 10, 4294967295) = 1 <1.091839>
1.091916 [ 7fe59bcaff60] accept(4, 0, NULL) = 9 <0.000035>
0.000093 [ 7fe59bcafeb3] fcntl(9, F_SETFD, FD_CLOEXEC) = 0 <0.000027>
0.000083 [ 7fe59b9d401a] setsockopt(9, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0 <0.000026>
0.000090 [ 7fe59b9d347a] epoll_ctl(8, EPOLL_CTL_ADD, 9, {EPOLLIN, {u32=21673504, u64=21673504}}) = 0 <0.000032>
0.000100 [ 7fe59b9d34b3] epoll_wait(8, {{EPOLLIN, {u32=21673504, u64=21673504}}}, 10, 4294967295) = 1 <0.000028>
0.000088 [ 7fe59bcb0130] recvmsg(9, {msg_name(0)=NULL, msg_iov(1)=[{"\3\0\0\0\0\0\0\0", 8}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=774, uid=0, gid=0}}, msg_flags=0}, 0) = 8 <0.000030>
0.000125 [ 7fe59bcb019d] sendto(9, "\0\0\0\0\0\0\0\0\364b\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 24, 0, NULL, 0) = 24 <0.000032>
0.000119 [ 7fe59b9d34b3] epoll_wait(8, {{EPOLLIN|EPOLLHUP, {u32=21673504, u64=21673504}}}, 10, 4294967295) = 1 <0.000071>
0.000139 [ 7fe59bcb0130] recvmsg(9, {msg_name(0)=NULL, msg_iov(1)=[{"\3\0\0\0\0\0\0\0", 8}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=0, uid=0, gid=0}}, msg_flags=0}, 0) = 0 <0.000018>
0.000112 [ 7fe59b9d347a] epoll_ctl(8, EPOLL_CTL_DEL, 9, NULL) = 0 <0.000028>
0.000076 [ 7fe59bcafd00] close(9) = 0 <0.000027>
0.000096 [ 7fe59b9d34b3] epoll_wait(8,
Then I looked at what epoll_wait was waiting on, which looks like fd 8. I am inferring this from epoll_wait(8, {{EPOLLIN, {u32=21673408, u64=21673408}}}, 10, 4294967295) = 1 <8.935002>, given the signature int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
$ cat /proc/25327/fdinfo/8
pos: 0
flags: 02000002
tfd: 7 events: 19 data: 14ab830
tfd: 4 events: 19 data: 14ab5c0
Also dumping fds 7 and 4 based on the tfd entries above (tfd appears to be the target fd registered on the epoll instance):
$ cat /proc/25327/fdinfo/4
pos: 0
flags: 02000002
$ cat /proc/25327/fdinfo/7
pos: 0
flags: 02000002
sigmask: fffffffe7ffbfab7
$ cd /proc/25327/fd
$ ls -al
lr-x------ 1 root root 64 Mar 13 22:28 0 -> /dev/null
lrwx------ 1 root root 64 Mar 13 22:28 1 -> /dev/pts/17
lrwx------ 1 root root 64 Mar 13 22:28 2 -> /dev/pts/17
l-wx------ 1 root root 64 Mar 13 22:28 3 -> /var/log/lxc/3da5764b7bc935896a72abc9371ce68d4d658d8c70b56e1090aacb631080ec0e.log
lrwx------ 1 root root 64 Mar 13 22:28 4 -> socket:[48415]
lrwx------ 1 root root 64 Mar 14 00:03 5 -> /dev/ptmx
lrwx------ 1 root root 64 Mar 14 00:03 6 -> /dev/pts/18
lrwx------ 1 root root 64 Mar 14 00:03 7 -> anon_inode:[signalfd]
lrwx------ 1 root root 64 Mar 14 00:03 8 -> anon_inode:[eventpoll]
info about socket:
$ sudo netstat -anp | grep 48415
Proto RefCnt Flags Type State I-Node PID/Program name Path
unix 2 [ ACC ] STREAM LISTENING 48415 25327/lxc-start @/var/lib/lxc/3da5764b7bc935896a72abc9371ce68d4d658d8c70b56e1090aacb631080ec0e/command
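For context on what that epoll_wait is multiplexing: fd 7 is a signalfd and fd 4 is the lxc command socket, so lxc-start presumably learns about its child's exit via SIGCHLD delivered through the signalfd rather than through a signal handler. Below is a minimal sketch of that pattern as I understand it (my own reconstruction for illustration, not lxc-start's actual code); if the signalfd event is never read and waitpid() never runs, the child stays <defunct> exactly like the node process above:

/* Sketch only: SIGCHLD routed through a signalfd (like fd 7) and multiplexed
 * with other fds on one epoll instance (like fd 8). Not lxc-start's code. */
#include <signal.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <sys/wait.h>

int main(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, NULL);         /* SIGCHLD now arrives via the fd */

    int sfd  = signalfd(-1, &mask, SFD_CLOEXEC); /* anon_inode:[signalfd], cf. fd 7 */
    int epfd = epoll_create1(EPOLL_CLOEXEC);     /* anon_inode:[eventpoll], cf. fd 8 */

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = sfd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, sfd, &ev);
    /* the command socket (cf. fd 4) would be added to the same epoll set */

    pid_t child = fork();
    if (child == 0) { execlp("sleep", "sleep", "1", (char *)NULL); _exit(1); }

    for (;;) {
        struct epoll_event out[10];
        int n = epoll_wait(epfd, out, 10, -1);   /* the call strace shows it stuck in */
        for (int i = 0; i < n; i++) {
            if (out[i].data.fd == sfd) {
                struct signalfd_siginfo si;
                if (read(sfd, &si, sizeof(si)) > 0)
                    waitpid(child, NULL, 0);     /* reaping is what removes the zombie */
                return 0;
            }
        }
    }
}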
There does seem to be a common pattern in docker.log: all containers that do not stop have this signature:
2014/03/16 16:33:15 Container beb71548b3b23ba3337ca30c6c2efcbfcaf19d4638cf3d5ec5b8a3e4c5f1059a failed to exit within 0 seconds of SIGTERM - using the force
2014/03/16 16:33:25 Container SIGKILL failed to exit within 10 seconds of lxc-kill beb71548b3b2 - trying direct SIGKILL
At this point I have no idea what to do next. Any suggestions on how I can find out what is causing these containers not to exit? Any other data I should collect? I also sent a SIGCHLD to this process, to no avail.
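For anyone wondering why sending SIGCHLD from outside cannot clear the zombie, here is a tiny standalone demo of the reaping rule (my own illustration, unrelated to the docker/lxc code): a defunct child only disappears when its parent calls wait()/waitpid(), so an external SIGCHLD helps nothing unless the parent reacts to it by reaping:

/* Demo only: shows that a zombie persists until the parent reaps it. */
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    pid_t child = fork();
    if (child == 0)
        _exit(0);                /* child exits immediately and becomes a zombie */

    /* While the parent sleeps, ps axo stat,ppid,pid,comm shows the child as
       Z ... <defunct>; kill -CHLD <parent pid> from another shell changes nothing. */
    sleep(30);

    waitpid(child, NULL, 0);     /* only the parent's wait()/waitpid() removes it */
    return 0;
}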
More data: I added a log line at the end of the node process that we start via the container's start command:
Mon Mar 17 2014 20:52:52 GMT+0000 (UTC) process: main process = exit code: 0
and here are logs from docker:
2014/03/17 20:52:52 Container f8a3d55e0f... failed to exit within 0 seconds of SIGTERM - using the force
2014/03/17 20:53:02 Container SIGKILL failed to exit within 10 seconds of lxc-kill f8a3d55e0fd8 - trying direct SIGKILL
The timestamps show the process exited at 20:52:52, yet Docker still could not stop the container. This happens with both the native and lxc Docker drivers.
EDIT: REPRO STEPS!
Turn this into a bash script and run it, and watch almost 50% of the containers turn into zombies:
#!/bin/bash
# Launch a new container every second, and stop each one 60 seconds after launch.
CNT=0
while true
do
    echo $CNT
    DOCK=$(sudo docker run -d -t anandkumarpatel/zombie_bug ./node index.js)
    # stop the container in the background once it has been up for a minute
    sleep 60 && sudo docker stop $DOCK > out.log &
    sleep 1
    CNT=$(($CNT+1))
    if [[ "$CNT" == "50" ]]; then
        exit
    fi
done
Changing to the latest kernel fixes the issue. I found the exact kernel difference:
REPRO: linux-image-3.8.0-31-generic
NO REPRO: linux-image-3.8.0-32-generic
I think this is the fix; as far as I can tell it makes zap_pid_ns_processes() wait for nr_hashed to drop to 2 instead of 1 when the exiting init task is not the thread-group leader, so the pid-namespace teardown no longer waits forever:
+++ linux-3.8.0/kernel/pid_namespace.c
@@ -181,6 +181,7 @@
int nr;
int rc;
struct task_struct *task, *me = current;
+ int init_pids = thread_group_leader(me) ? 1 : 2;
/* Don't allow any more processes into the pid namespace */
disable_pid_allocation(pid_ns);
@@ -230,7 +231,7 @@
*/
for (;;) {
set_current_state(TASK_UNINTERRUPTIBLE);
- if (pid_ns->nr_hashed == 1)
+ if (pid_ns->nr_hashed == init_pids)
break;
schedule();
}
which came from here: https://groups.google.com/forum/#!msg/fa.linux.kernel/u4b3n4oYDQ4/GuLrXfDIYggJ
Going to upgrade all our servers that repro this and see if the issue still occurs.