In our project, we inherited a Docker environment with a service stack running in it. I've noticed that Docker restarts the stack once it hits its memory limit. Unfortunately, I haven't found any information on Docker's website that addresses my questions, so I'm asking here:
You can use the --restart=unless-stopped option, as @Shibashis mentioned, or update the restart policy (this requires Docker 1.11 or newer); see the documentation for docker update and Docker restart policies. To update all your containers at once: docker update --restart=no $(docker ps -a -q)
You can change the restart policy of an existing container using docker update. Pass the name of the container to the command; you can find container names by running docker ps -a. docker update works with both running and stopped containers.
Use sudo docker update --restart=no <container_id> to update the --restart flag of the container. After that, you can stop the container.
--restart=always: Docker always attempts to restart the container when the container exits.
--restart=no: Docker does not attempt to restart the container when the container exits. This is the default policy.
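If you're not sure what an existing container is currently set to, a minimal sketch (with <container_name> as a placeholder) is to read the policy via docker inspect and then switch automatic restarts off with docker update:
# check the container's current restart policy
docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' <container_name>
# disable automatic restarts for that container
docker update --restart=no <container_name>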
- Is this behaviour configurable? For instance, I don't want Docker to restart my stack under any circumstances. If it is configurable, then how?
With a version 3 stack, the restart policy moved to the deploy section:
version: '3'
services:
  crash:
    image: busybox
    command: sleep 10
    deploy:
      restart_policy:
        condition: none
        # max_attempts: 2
Documentation on this is available at: https://docs.docker.com/compose/compose-file/#restart_policy
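For completeness, a hedged sketch of how the compose file above becomes the restart_crash service used in the examples below; the stack name "restart" is an assumption chosen to match those examples:
# deploy the compose file as a swarm stack named "restart",
# which creates the service "restart_crash"
docker stack deploy -c docker-compose.yml restart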
- Is there any Docker journal that keeps stack restarts as its entries?
Depending on the task history limit (configurable with docker swarm update), you can view the previously run tasks for a service:
$ docker service ps restart_crash
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
30okge1sjfno restart_crash.1 busybox:latest bmitch-asusr556l Shutdown Complete 4 minutes ago
papxoq1vve1a \_ restart_crash.1 busybox:latest bmitch-asusr556l Shutdown Complete 4 minutes ago
1hji2oko51sk \_ restart_crash.1 busybox:latest bmitch-asusr556l Shutdown Complete 5 minutes ago
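If you want more (or less) of that history retained, a small sketch of adjusting the limit; 10 is an arbitrary example value:
# keep up to 10 old tasks per slot in "docker service ps" output
docker swarm update --task-history-limit 10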
And you can inspect the state for any one task:
$ docker inspect 30okge1sjfno --format '{{json .Status}}' | jq .
{
  "Timestamp": "2018-11-06T19:55:02.208633174Z",
  "State": "complete",
  "Message": "finished",
  "ContainerStatus": {
    "ContainerID": "8e9310bde9acc757f94a56a32c37a08efeed8a040ce98d84c851d4eef0afc545",
    "PID": 0,
    "ExitCode": 0
  },
  "PortStatus": {}
}
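Building on that, a short sketch that loops over every recorded task of the example restart_crash service and prints its state and exit code; the format fields mirror the JSON shown above:
# list state and exit code for each task recorded for the service
for task in $(docker service ps -q restart_crash); do
  docker inspect "$task" \
    --format '{{.ID}} {{.Status.State}} exit={{.Status.ContainerStatus.ExitCode}}'
done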
There's also an event history in the docker engine that you can query:
$ docker events --filter label=com.docker.swarm.service.name=restart_crash --filter event=die --since 15m --until 0s
2018-11-06T14:54:09.417465313-05:00 container die f17d945b249a04e716155bcc6d7db490e58e5be00973b0470b05629ce2cca461 (com.docker.stack.namespace=restart, com.docker.swarm.node.id=q44zx0s2lvu1fdduk800e5ini, com.docker.swarm.service.id=uqirm6a8dix8c2n50thmpzj06, com.docker.swarm.service.name=restart_crash, com.docker.swarm.task=, com.docker.swarm.task.id=1hji2oko51skhv8fv1nw71gb8, com.docker.swarm.task.name=restart_crash.1.1hji2oko51skhv8fv1nw71gb8, exitCode=0, image=busybox:latest@sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812, name=restart_crash.1.1hji2oko51skhv8fv1nw71gb8)
2018-11-06T14:54:32.391165964-05:00 container die d6f98b8aaa171ca8a2ddaf31cce7a1e6f1436ba14696ea3842177b2e5e525f13 (com.docker.stack.namespace=restart, com.docker.swarm.node.id=q44zx0s2lvu1fdduk800e5ini, com.docker.swarm.service.id=uqirm6a8dix8c2n50thmpzj06, com.docker.swarm.service.name=restart_crash, com.docker.swarm.task=, com.docker.swarm.task.id=papxoq1vve1adriw6e9xqdaad, com.docker.swarm.task.name=restart_crash.1.papxoq1vve1adriw6e9xqdaad, exitCode=0, image=busybox:latest@sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812, name=restart_crash.1.papxoq1vve1adriw6e9xqdaad)
2018-11-06T14:55:00.126450155-05:00 container die 8e9310bde9acc757f94a56a32c37a08efeed8a040ce98d84c851d4eef0afc545 (com.docker.stack.namespace=restart, com.docker.swarm.node.id=q44zx0s2lvu1fdduk800e5ini, com.docker.swarm.service.id=uqirm6a8dix8c2n50thmpzj06, com.docker.swarm.service.name=restart_crash, com.docker.swarm.task=, com.docker.swarm.task.id=30okge1sjfnoicd0lo2g1y0o7, com.docker.swarm.task.name=restart_crash.1.30okge1sjfnoicd0lo2g1y0o7, exitCode=0, image=busybox:latest@sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812, name=restart_crash.1.30okge1sjfnoicd0lo2g1y0o7)
See more details on the events command at: https://docs.docker.com/engine/reference/commandline/events/
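As a hedged variation on the query above, --format can reduce each event to the fields you care about; "name" and "exitCode" are attributes visible in the raw events:
# compact view of the same "die" events
docker events --filter label=com.docker.swarm.service.name=restart_crash \
  --filter event=die --since 15m --until 0s \
  --format '{{.Time}} {{index .Actor.Attributes "name"}} exit={{index .Actor.Attributes "exitCode"}}'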
The best practice in larger organizations is to send container logs to a central location (e.g. Elastic) and to monitor metrics externally (e.g. Prometheus/Grafana).
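One hedged option among many for the log-shipping part: point a service's log driver at a central collector; the gelf driver and the udp://logstash:12201 address are placeholders for whatever your organization runs:
# ship the service's container logs to an external collector
docker service update --log-driver gelf \
  --log-opt gelf-address=udp://logstash:12201 restart_crash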
Since you haven't added any configuration snippet or runtime commands to your post, I'll have to make some assumptions about your actual setup.
My assumptions:
I assume your docker-compose.yml looks like the following:
version: '2.1'
services:
  service1:
    image: some/image
    restart: always
    mem_limit: 512m
  service2:
    image: another/image
    restart: always
    mem_limit: 512m
With this configuration, any of the service containers would be OOM-killed by the kernel when it tries to use more than 512 MB of memory. Docker would then automatically start a fresh container to replace the killed one.
So to answer your first point: yes, it is configurable; just change "restart" to "no", or simply remove that line (since "no" is the default value for this parameter). As for your second point, look for service restarts in the Docker daemon logs, as sketched below.
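A minimal sketch of both checks, assuming a systemd host and using <container_name> as a placeholder:
# search the daemon logs for OOM kills and restart activity
journalctl -u docker.service | grep -iE 'oom|restart'
# ask Docker whether a specific container was OOM-killed, and with what exit code
docker inspect --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}' <container_name>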
Yet, if what you need is to keep your service up, this is not going to help you: your service will still try to use more than its allowed memory limit, it will still get killed... and it will no longer be restarted automatically.
It would be better to review the memory usage pattern of your services and understand why they are attempting to use more than the configured limit. Ultimately, the solution is either to configure your services to use less memory or to raise the mem_limit in your docker-compose.yml.
For example, if a service runs on the JVM, you can make its heap respect the container's memory limit with flags like:
-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap
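As an illustration only (the image name and the 512m limit are placeholders), one way to pass those flags so the JVM sizes its heap from the cgroup limit:
# run a Java service with a memory limit and cgroup-aware heap sizing
docker run -d --memory=512m \
  -e JAVA_TOOL_OPTIONS="-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap" \
  some/java-image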
I hope this helps; to be more precise, I would really need more context.