I'm hosting a Ruby application in a Docker container on AWS. Unfortunately this Ruby application is known to leak memory, so eventually it consumes all of the available memory.
I'm, perhaps naively, expecting the OOM killer to get invoked and kill the Ruby process, but nothing happens. Eventually the machine becomes unresponsive (the web server doesn't respond, ssh is disabled). We force a restart of the machine from the AWS console and get the following message in the logs, so it is indeed alive at the time of the restart:
Apr 30 23:07:14 ip-10-0-10-24 init: serial (ttyS0) main process (2947) killed by TERM signal
I don't believe that this is resource exhaustion (i.e. running out of credits) in AWS. If I restart the application periodically, the server never goes down.
I'm very much at a loss here; why would memory pressure be causing machines to lock up?
Apparently the solution I provided didn't help the person who asked the question, but it might help someone else who stumbles upon this. The following are the two things I suggested might be causing the problem.
Suggestion 1
I am guessing you are using the official ruby Docker image, and that when you run the container ruby is running as PID 1 inside the container.
If ruby is running as PID 1, then the OOM killer won't be able to kill it, causing all the problems you are seeing.
To solve this problem you will have to make sure a proper init process runs as PID 1.
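To check the guess above before changing anything, it helps to see what is actually running as PID 1 inside the container. The following is a minimal sketch, assuming a Linux container with /proc mounted and Python available in the image; the function name describe_pid1 and the proc_root parameter are hypothetical, added only for illustration.

```python
from pathlib import Path


def describe_pid1(proc_root="/proc"):
    """Return (command name, oom_score_adj) for PID 1 under proc_root."""
    pid1 = Path(proc_root) / "1"
    # /proc/<pid>/comm holds the short command name of the process.
    comm = (pid1 / "comm").read_text().strip()
    # /proc/<pid>/oom_score_adj ranges from -1000 to 1000; -1000 exempts
    # the process from the OOM killer entirely.
    oom_score_adj = int((pid1 / "oom_score_adj").read_text().strip())
    return comm, oom_score_adj
```

Calling describe_pid1() inside the container and seeing ruby as the command name would be consistent with the guess above. With an init in place, the name should be that of the init process instead (the exact name may vary by Docker version).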
Docker 1.13 and above (API version 1.25+) has the --init option for the docker run command. This option makes sure that a proper init process handles the duties of PID 1, and it also forwards signals to your ruby application.
https://docs.docker.com/engine/reference/commandline/run/
--init API 1.25+ Run an init inside the container that forwards signals and reaps processes
Docker uses tini as this init: https://github.com/krallin/tini
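As a rough sketch of how a deployment script might start the container with --init, the helper below builds the docker run command line and runs it. The function names, the image name my-ruby-app and the app.rb command are assumptions for illustration only; the --init and --rm flags are the documented docker run options.

```python
import subprocess


def docker_run_with_init(image, *command):
    """Build a `docker run` argv that asks Docker to start an init as PID 1."""
    # --init runs an init inside the container that forwards signals and
    # reaps processes; --rm removes the container once it exits.
    return ["docker", "run", "--init", "--rm", image, *command]


def launch(image, *command):
    """Run the container; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(docker_run_with_init(image, *command), check=True)
```

For example, launch("my-ruby-app", "ruby", "app.rb") would run the equivalent of docker run --init --rm my-ruby-app ruby app.rb, so ruby becomes a child of the init rather than PID 1 itself.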
Suggestion 2
There is a known issue with the Amazon Linux AMI; the details can be found at https://github.com/aws/amazon-ecs-agent/issues/794. As of writing, I am not sure whether the problem with the AMI has been fixed.
So try a different AMI as suggested in that thread, for example the Ubuntu AMI.