
AWS beanstalk docker exception: `shim reaped`

I am currently deploying an application (a Node.js websocket server) to production on AWS Elastic Beanstalk, using the Docker environment.

Periodically, the containers 'crash' (actually, the main process in the container restarts), and I can't figure out why. /var/log/docker contains these logs (at the exact moment the incident happens):

time="2018-12-07T00:48:46Z" level=info msg="shim reaped" id=0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f 
time="2018-12-07T00:48:46.052832134Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
time="2018-12-07T00:48:46Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f/shim.sock" debug=false pid=9192

CPU and RAM usage seem OK at that moment. Does anyone have a clue?

Edit: There are also other logs, but I suspect they are a consequence of the restart rather than its cause:

/var/log/nginx/error.log:

2018/12/07 00:48:45 [error] 4268#0: *10397 recv() failed (104: Connection reset by peer) while proxying upgraded connection, client: 172.31.43.209, server: , request: "GET /stream?s=000 HTTP/1.1", upstream: "http://172.17.0.2:80/stream?s=000", host: "..."
2018/12/07 00:48:45 [error] 4268#0: *1009 recv() failed (104: Connection reset by peer) while proxying upgraded connection, client: 172.31.43.209, server: , request: "GET /stream?s=000 HTTP/1.1", upstream: "http://172.17.0.2:80/stream?s=000", host: "..."
2018/12/07 00:48:46 [error] 4267#0: *11092 connect() failed (111: Connection refused) while connecting to upstream, client: 172.31.12.149, server: , request: "GET /stream?s=000 HTTP/1.1", upstream: "http://172.17.0.2:80/stream?s=000", host: "..."

/var/log/docker-events.log:

2018-12-07T00:48:46.052880449Z container die 0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f (exitCode=1, image=2fc4abcada2b, name=inspiring_euler)
2018-12-07T00:48:46.176330610Z network disconnect 94c449d445a5a434af70517a1c8734c540c5c1f9ddbbc1a53a002f25dbc7f581 (container=0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f, name=bridge, type=bridge)
2018-12-07T00:48:46.626514590Z network connect 94c449d445a5a434af70517a1c8734c540c5c1f9ddbbc1a53a002f25dbc7f581 (container=0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f, name=bridge, type=bridge)
2018-12-07T00:48:46.869988171Z container start 0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f (image=2fc4abcada2b, name=inspiring_euler)
asked Dec 07 '18 14:12 by p9f


1 Answer

This failure may be caused by containerd running on a system with THP (transparent huge pages) enabled. The THP memory management scheme does not align with your container's memory allocation pattern, which causes the failure. A similar issue was reported at https://github.com/containerd/containerd/issues/2202

Unfortunately, you cannot tune the kernel settings on Elastic Beanstalk hosts to resolve this problem. The solution is documented for MongoDB, which has similar issues with THP:

https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/
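
To check whether THP is actually active on the host before pursuing this theory, something like the following sketch may help. It assumes the standard Linux sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, whose contents look like `always madvise [never]` with the active mode in brackets. The function names are illustrative only, and the script is meant to be run on the instance itself (for example over SSH).

```python
from pathlib import Path
from typing import Optional

# Standard sysfs location for the THP setting (assumed to exist on the host kernel).
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(contents: str) -> str:
    """Return the active mode from text such as 'always madvise [never]'."""
    for token in contents.split():
        # The kernel marks the active mode by wrapping it in square brackets.
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode marked in {contents!r}")


def read_thp_mode(path: Path = THP_ENABLED_PATH) -> Optional[str]:
    """Read the active THP mode, or return None if the file does not exist."""
    try:
        return parse_thp_mode(path.read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    mode = read_thp_mode()
    print("THP mode:", mode if mode is not None else "unavailable on this kernel")
```

If this reports `never`, THP is already off and is unlikely to explain the restarts. In that case the `exitCode=1` in the docker events log and the application's own output are probably a better place to look.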

answered Nov 15 '22 23:11 by Oxidizing1