I am writing a web app based on Docker Compose, and it has a number of background systems - an HTTP API, a HTTP proxy and a queue. Everything is in PHP in Alpine containers, either PHP 5.6 or 7.0.
I originally set up the queue in Supervisor inside the API container, which worked fine. However Supervisor/Python makes containers much fatter than they should be (80M rather than 25M) so I moved the queue to its own container. It lives for ~5 minutes and quits in order to be restarted, and I was using the auto-restart system in Supervisor, so I have swapped to the restart system in Docker Compose. I am using version 2 of the Compose YAML format.
When the queue starts up, it renders a simple message to stdout:
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
This is fine when I do docker-compose up
initially. However, for every restart, I get three of these messages, then five, and so on, with no limit. If I do docker ps
then it indicates there is only one queue running:
halfer@machine:~/proximate-app$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a9c94558769d proximate-app "/tmp/container-st..." 2 hours ago Up 2 hours 0.0.0.0:8084->8084/tcp app_instance
7e48d6aec459 proximate-api "sh /tmp/bin/web-s..." 2 hours ago Up 2 hours 0.0.0.0:8080->8080/tcp api_instance
86c564becadf proximate-queue "sh /var/app/bin/c..." 2 hours ago Up About a minute queue_instance
20c2145f80e4 proximate-proxy "sh /var/proxy/con..." 2 hours ago Up 2 hours 0.0.0.0:8081->8081/tcp proxy_instance
Here is my Compose file:
version: '2'
services:
proximate-app:
container_name: "app_instance"
image: proximate-app
ports:
- "8084:8084"
links:
- proximate-api
# @todo Remove external ports once everything is happy
proximate-api:
container_name: "api_instance"
image: proximate-api
ports:
- "8080:8080"
links:
- proximate-proxy
- proximate-queue
volumes:
- proximate-volume:/remote/cache
- proximate-q-volume:/remote/queue
# Use time and TZ from the host, could alternatively use env vars and set it
# manually in the container, see https://wiki.alpinelinux.org/wiki/Setting_the_timezone
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
# Should perhaps pass this as a var to docker-compose so as not to hardwire it,
# but it is fine for now
environment:
- PHP_TIMEZONE=Europe/London
proximate-queue:
container_name: "queue_instance"
image: proximate-queue
restart: always
links:
- proximate-proxy
volumes:
- proximate-volume:/remote/cache
- proximate-q-volume:/remote/queue
environment:
- PROXY_ADDRESS=proximate-proxy:8081
# @todo Remove external ports once everything is happy
proximate-proxy:
container_name: "proxy_instance"
image: proximate-proxy
ports:
- "8081:8081"
volumes:
- proximate-volume:/remote/cache
environment:
- PROXY_LOG_PATH=/remote/cache/proxy.log
volumes:
proximate-volume:
proximate-q-volume:
The relevant container is proximate-queue
.
I am pretty sure that my container itself is not responsible for this oddity. My Dockerfile
enters thus:
ENTRYPOINT ["sh", "/var/app/bin/container-start.sh"]
And that just calls a launch script:
#!/bin/sh
php \
/var/app/bin/queue.php \
--queue-path /remote/queue \
--proxy-address ${PROXY_ADDRESS}
Which runs a queue process:
#!/usr/bin/env php
<?php
use Proximate\Service\File;
use Proximate\Service\SiteFetcher as SiteFetcherService;
use Proximate\Queue\Read as QueueReader;
$root = realpath(__DIR__ . '/..');
require_once $root . '/vendor/autoload.php';
$actions = getopt('p:q:', ['proxy-address:', 'queue-path:']);
$queuePath = isset($actions['queue-path']) ? $actions['queue-path'] : (isset($actions['q']) ? $actions['q'] : null);
$proxyAddress = isset($actions['proxy-address']) ? $actions['proxy-address'] : (isset($actions['p']) ? $actions['p'] : null);
if (!$queuePath || !$proxyAddress)
{
$command = __FILE__;
die(
sprintf("Syntax: %s --proxy-address <proxy:port> --queue-path <queue-path>\n", $command)
);
}
if (!file_exists($queuePath))
{
die(
sprintf("Error: the supplied queue path `%s` does not exist\n", $queuePath)
);
}
echo sprintf(
"Starting queue watcher (path=%s, proxying to %s)\n",
$queuePath,
$proxyAddress
);
$queue = new QueueReader($queuePath, new File());
$queue->
setFetcher(new SiteFetcherService($proxyAddress))->
process();
As you can see, the echo sprintf()
is what announces the start-up, and there's nothing like loops that could be doing this on my side. Could this be a bug in Docker Compose? I am using docker-compose version 1.9.0, build 2585387
on Ubuntu 14.04.
For reference, the Docker Compose stdout looks like this (the repeated lines from the queue are visible):
halfer@machine:~/proximate-app$ docker-compose up
Creating network "proximateapp_default" with the default driver
Creating proxy_instance
Creating queue_instance
Creating api_instance
Creating app_instance
Attaching to proxy_instance, queue_instance, api_instance, app_instance
proxy_instance | Teeing proxy logs also to /remote/cache/proxy.log
proxy_instance | [2017-05-10 09:18:42] stdout.INFO: Setting up queue at `/remote/cache/data` [] []
proxy_instance | [2017-05-10 09:18:42] stdout.INFO: Starting proxy listener on 172.18.0.2:8081 [] []
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
api_instance | PHP 7.0.16 Development Server started at Wed May 10 10:19:00 2017
app_instance | PHP 5.6.29 Development Server started at Wed May 10 09:19:10 2017
app_instance | PHP 5.6.29 Development Server started at Wed May 10 09:19:10 2017
queue_instance exited with code 0
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance exited with code 0
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
One thing I could try is to get the app to sleep and do nothing else, to prove that some weird exit handler or something is not causing havoc. However, I expect that will do exactly the same thing.
I have replaced the queue with a script that prints time information then sleeps for 20 seconds. This is what I get:
halfer@machine:~/proximate-app$ docker-compose up
Creating network "proximateapp_default" with the default driver
Creating proxy_instance
Creating queue_instance
Creating api_instance
Creating app_instance
Attaching to proxy_instance, queue_instance, api_instance, app_instance
proxy_instance | Teeing proxy logs also to /remote/cache/proxy.log
proxy_instance | [2017-05-10 11:51:17] stdout.INFO: Setting up queue at `/remote/cache/data` [] []
proxy_instance | [2017-05-10 11:51:17] stdout.INFO: Starting proxy listener on 172.18.0.2:8081 [] []
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:51:27 +0000. Microtime=1494417087.107185
api_instance | PHP 7.0.16 Development Server started at Wed May 10 12:51:37 2017
app_instance | PHP 5.6.29 Development Server started at Wed May 10 11:51:46 2017
app_instance | PHP 5.6.29 Development Server started at Wed May 10 11:51:46 2017
queue_instance exited with code 0
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:51:27 +0000. Microtime=1494417087.107185
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:51:55 +0000. Microtime=1494417115.178871
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:52:22 +0000. Microtime=1494417142.409513
queue_instance exited with code 0
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:51:27 +0000. Microtime=1494417087.107185
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:51:55 +0000. Microtime=1494417115.178871
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:52:22 +0000. Microtime=1494417142.409513
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:52:49 +0000. Microtime=1494417169.612523
queue_instance | Hello everyone! Time=Wed, 10 May 2017 11:53:17 +0000. Microtime=1494417197.826749
That seems to show that:
What is going on?
docker restart does two things: It does the equivalent of docker stop . It sends SIGTERM to its primary process (only); if that doesn't terminate within 10 seconds, it sends SIGKILL. If the primary process still has children, they also get forcibly terminated.
The docker compose start command is useful only to restart containers that were previously created, but were stopped. It never creates new containers.
no: Containers won't restart automatically. on-failure[:max-retries]: Restart the container if it exits with a non-zero exit code, and provide a maximum number of attempts for the Docker daemon to restart the container.
You can use the --restart=unless-stopped option, as @Shibashis mentioned, or update the restart policy (this requires docker 1.11 or newer); See the documentation for docker update and Docker restart policies. Use docker update --restart=no $(docker ps -a -q) to update all your containers :-) Great answer!!
I sum up what we discovered.
The cumulative output is due to the container is not being removed each time it exits. So, docker-compose starts it again showing you the docker logs -f
from the same container that is starting again. That is working as expected.
I didn't find a way to get docker-compose removing a container before start it again inside the same docker-compose up
I like your queue-based architecture, but I think that it would be better if you use a long-lived container instead of one container that starts/exits/starts/exits... You can accomplish it doing a simple while true; do ./queue-app-blabla.php; sleep 20; done
in a shell script.
Anyone that need to reproduce this, can use the following docker-compose.yml
version: "2"
services:
ui:
image: ubuntu
command: tail -f /dev/null
links:
- queue
queue:
image: ubuntu
command: sh -c "date; echo Sleeping...; sleep 20; echo exit."
restart: always
Output:
queue_1 | Mon May 15 04:32:12 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
docker_queue_1 exited with code 0
queue_1 | Mon May 15 04:32:12 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
queue_1 | Mon May 15 04:32:33 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
queue_1 | Mon May 15 04:32:54 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
docker_queue_1 exited with code 0
queue_1 | Mon May 15 04:32:12 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
queue_1 | Mon May 15 04:32:33 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
queue_1 | Mon May 15 04:32:54 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
queue_1 | Mon May 15 04:33:17 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
queue_1 | Mon May 15 04:33:38 UTC 2017
queue_1 | Sleeping...
queue_1 | exit.
My proposed solution:
version: "2"
services:
ui:
image: ubuntu
command: tail -f /dev/null
links:
- queue
queue:
image: ubuntu
command: sh -c "while true; do date; echo Sleeping...; sleep 20; echo loop.; done"
restart: always
Output:
Attaching to docker_queue_1, docker_ui_1
queue_1 | Mon May 15 04:36:16 UTC 2017
queue_1 | Sleeping...
queue_1 | loop.
queue_1 | Mon May 15 04:36:36 UTC 2017
queue_1 | Sleeping...
queue_1 | loop.
queue_1 | Mon May 15 04:36:56 UTC 2017
queue_1 | Sleeping...
queue_1 | loop.
queue_1 | Mon May 15 04:37:16 UTC 2017
queue_1 | Sleeping...
IMO, it's better to see containers start and work, instead of flapping start/exit containers. Docker has an special treatment to containers that have this behaviour, see docker docs. "An ever increasing delay (double the previous delay, starting at 100 milliseconds) is added before each restart to prevent flooding the server." (but it does not apply in your case because your container is living > 10s.
Edit:
To check a possible error and exit. This could be useful:
#!/bin/bash
PERIOD=10 # seconds
while true; do
echo Starting process...
date # the process
STATUS=$?
if [ $STATUS -ne 0 ]; then
echo "Error found. Exiting" >&2
exit $STATUS
fi
echo End process. Sleeping ${PERIOD}s
sleep $PERIOD
done
Output:
▶ ./script.sh
Starting process...
Mon May 15 11:43:06 ART 2017
End process. Sleeping 10s
Starting process...
Mon May 15 11:43:16 ART 2017
End process. Sleeping 10s
Starting process...
Mon May 15 11:43:26 ART 2017
End process. Sleeping 10s
Starting process...
Mon May 15 11:43:36 ART 2017
End process. Sleeping 10s
Starting process...
Mon May 15 11:43:46 ART 2017
End process. Sleeping 10s
Using the Docker restart functionality just to restart what should be a service (long running queue task) is not necessary at all as the operating system comes with everything you need.
Before making use of the docker restart functionality you restarted the queue your own. But you wrote you retired that Supervisor/Python because it is too fat (~55M) in your eyes.
So I read it that you want to keep the footprint low.
Let's do that. So my proposal is, start at the bare minimum, the operating system. It should have everything you need for your queue process management as that is what is common for operating systems: invoke and control other processes (to keep the list concise).
This is even true for the Alpine Linux PHP images with Busybox with a reduced set of commands and shells (e.g. no Bash).
So I assume you want to kick the PHP script every 5 minutes and then start it again in my example (without timeout it is possible as well, even with less code).
The docker image ships with the timeout command that does this. You can make use of it in the Dockerfile entrypoint sh script. So PHP is invoked and terminated after 300 seconds (five minutes):
# executed by sh command from BusyBox v1.24.2 (2017-01-18 14:13:46 GMT) multi-call binary
#
# usage: ENTRYPOINT ["sh", "docker-entrypoint.sh"]
set -eu
timed_queue() {
set +e
timeout \
-t 300 \
php -d max_execution_time=301 \
-f /queue.php -- \
--proxy-address "${PROXY_ADDRESS}" \
--queue-path /remote/queue \
;
retval=$?
set -e
}
timed_queue
while [ $retval -eq 143 -o $retval -eq 0 ]; do
timed_queue
done
The setting to the settings is done with the -t 300
command-line parameter to timeout
. The exit status is 143 by default (SIGTERM) in case the timeout kicks in.
Your queue script returns 0 in case it ends successfully. In both cases the script is started again (with the timeout).
This requires an additional small change to your queue script. You make use of the die command in the following fashion to signal errors:
die(
sprintf("Error: the supplied queue path `%s` does not exist\n", $queuePath)
);
Instead output the error message to standard error output and exit with a non-zero status code (e.g. 1):
fprintf(STDERR, "Error: the supplied queue path `%s` does not exist\n", $queuePath);
die(1);
This will make the script compatible with the operating system. On the command-line an exit status of zero (0) signals OK (that is using die("string")
with a string message as in your example) but a non-zero code shows error (for starters use 1).
Done this, you will get the functionality of a re-starting and timeout-ing script with practically no overhead. That is both in file-size but as well in processing time. I've put this to a test with a timeout of one second and it shows only a very little overhead of some microseconds when restarted:
queue_instance | info: 2017-05-19T14:00:24.34824700 queue.php started...
queue_instance | info: 2017-05-19T14:00:25.35548200 queue.php started...
queue_instance | info: 2017-05-19T14:00:26.34564400 queue.php started...
queue_instance | info: 2017-05-19T14:00:27.35868700 queue.php started...
queue_instance | info: 2017-05-19T14:00:28.34597300 queue.php started...
queue_instance | info: 2017-05-19T14:00:29.34139800 queue.php started...
queue_instance | info: 2017-05-19T14:00:30.26049500 queue.php started...
queue_instance | info: 2017-05-19T14:00:31.26174500 queue.php started...
queue_instance | info: 2017-05-19T14:00:32.26322800 queue.php started...
queue_instance | info: 2017-05-19T14:00:33.26352800 queue.php started...
queue_instance | info: 2017-05-19T14:00:34.26533300 queue.php started...
queue_instance | info: 2017-05-19T14:00:35.26524300 queue.php started...
queue_instance | info: 2017-05-19T14:00:36.26743300 queue.php started...
queue_instance | info: 2017-05-19T14:00:37.26889500 queue.php started...
queue_instance | info: 2017-05-19T14:00:38.27222300 queue.php started...
queue_instance | info: 2017-05-19T14:00:39.27209000 queue.php started...
queue_instance | info: 2017-05-19T14:00:40.27620500 queue.php started...
queue_instance | info: 2017-05-19T14:00:41.27985300 queue.php started...
queue_instance | info: 2017-05-19T14:00:42.28136100 queue.php started...
queue_instance | info: 2017-05-19T14:00:43.28252200 queue.php started...
queue_instance | info: 2017-05-19T14:00:44.28403600 queue.php started...
queue_instance | info: 2017-05-19T14:00:45.28595300 queue.php started...
queue_instance | info: 2017-05-19T14:00:46.28683900 queue.php started...
queue_instance | info: 2017-05-19T14:00:47.28803800 queue.php started...
The subtle differences can most likely be ignored in your scenario as you want to run the script for five minutes.
In case the process crashes (fatal errors etc.) PHP will have an exit code of 127 so you will get notice as the container shuts down then. Same if there is a mistake in the Dockerfile entrypoint script, like an undefined environment variable.
So if I understood you right, this should be what you're looking for.
Depending on what is finally invoked via CMD
or ENTRYPOINT
, more process handling might apply. Or more handling might apply further. Some references:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With