Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why might Docker Compose restart be rendering container output an increasing number of times?

I am writing a web app based on Docker Compose, and it has a number of background systems - an HTTP API, a HTTP proxy and a queue. Everything is in PHP in Alpine containers, either PHP 5.6 or 7.0.

I originally set up the queue in Supervisor inside the API container, which worked fine. However Supervisor/Python makes containers much fatter than they should be (80M rather than 25M) so I moved the queue to its own container. It lives for ~5 minutes and quits in order to be restarted, and I was using the auto-restart system in Supervisor, so I have swapped to the restart system in Docker Compose. I am using version 2 of the Compose YAML format.

When the queue starts up, it renders a simple message to stdout:

queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)

This is fine when I do docker-compose up initially. However, for every restart, I get three of these messages, then five, and so on, with no limit. If I do docker ps then it indicates there is only one queue running:

halfer@machine:~/proximate-app$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
a9c94558769d        proximate-app       "/tmp/container-st..."   2 hours ago         Up 2 hours          0.0.0.0:8084->8084/tcp   app_instance
7e48d6aec459        proximate-api       "sh /tmp/bin/web-s..."   2 hours ago         Up 2 hours          0.0.0.0:8080->8080/tcp   api_instance
86c564becadf        proximate-queue     "sh /var/app/bin/c..."   2 hours ago         Up About a minute                            queue_instance
20c2145f80e4        proximate-proxy     "sh /var/proxy/con..."   2 hours ago         Up 2 hours          0.0.0.0:8081->8081/tcp   proxy_instance

Here is my Compose file:

version: '2'
services:

  proximate-app:
    container_name: "app_instance"
    image: proximate-app
    ports:
    - "8084:8084"
    links:
    - proximate-api

  # @todo Remove external ports once everything is happy
  proximate-api:
    container_name: "api_instance"
    image: proximate-api
    ports:
    - "8080:8080"
    links:
    - proximate-proxy
    - proximate-queue
    volumes:
    - proximate-volume:/remote/cache
    - proximate-q-volume:/remote/queue
    # Use time and TZ from the host, could alternatively use env vars and set it
    # manually in the container, see https://wiki.alpinelinux.org/wiki/Setting_the_timezone
    - /etc/localtime:/etc/localtime:ro
    - /etc/timezone:/etc/timezone:ro
    # Should perhaps pass this as a var to docker-compose so as not to hardwire it,
    # but it is fine for now
    environment:
    - PHP_TIMEZONE=Europe/London

  proximate-queue:
    container_name: "queue_instance"
    image: proximate-queue
    restart: always
    links:
    - proximate-proxy
    volumes:
    - proximate-volume:/remote/cache
    - proximate-q-volume:/remote/queue
    environment:
    - PROXY_ADDRESS=proximate-proxy:8081

  # @todo Remove external ports once everything is happy
  proximate-proxy:
    container_name: "proxy_instance"
    image: proximate-proxy
    ports:
    - "8081:8081"
    volumes:
    - proximate-volume:/remote/cache
    environment:
    - PROXY_LOG_PATH=/remote/cache/proxy.log

volumes:
  proximate-volume:
  proximate-q-volume:

The relevant container is proximate-queue.

I am pretty sure that my container itself is not responsible for this oddity. My Dockerfile enters thus:

ENTRYPOINT ["sh", "/var/app/bin/container-start.sh"]

And that just calls a launch script:

#!/bin/sh

php \
    /var/app/bin/queue.php \
    --queue-path /remote/queue \
    --proxy-address ${PROXY_ADDRESS}

Which runs a queue process:

#!/usr/bin/env php
<?php

use Proximate\Service\File;
use Proximate\Service\SiteFetcher as SiteFetcherService;
use Proximate\Queue\Read as QueueReader;

$root = realpath(__DIR__ . '/..');
require_once $root . '/vendor/autoload.php';

$actions = getopt('p:q:', ['proxy-address:', 'queue-path:']);

$queuePath = isset($actions['queue-path']) ? $actions['queue-path'] : (isset($actions['q']) ? $actions['q'] : null);
$proxyAddress = isset($actions['proxy-address']) ? $actions['proxy-address'] : (isset($actions['p']) ? $actions['p'] : null);

if (!$queuePath || !$proxyAddress)
{
    $command = __FILE__;
    die(
        sprintf("Syntax: %s --proxy-address <proxy:port> --queue-path <queue-path>\n", $command)
    );
}

if (!file_exists($queuePath))
{
    die(
        sprintf("Error: the supplied queue path `%s` does not exist\n", $queuePath)
    );
}

echo sprintf(
    "Starting queue watcher (path=%s, proxying to %s)\n",
    $queuePath,
    $proxyAddress
);

$queue = new QueueReader($queuePath, new File());
$queue->
    setFetcher(new SiteFetcherService($proxyAddress))->
    process();

As you can see, the echo sprintf() is what announces the start-up, and there's nothing like loops that could be doing this on my side. Could this be a bug in Docker Compose? I am using docker-compose version 1.9.0, build 2585387 on Ubuntu 14.04.

For reference, the Docker Compose stdout looks like this (the repeated lines from the queue are visible):

halfer@machine:~/proximate-app$ docker-compose up
Creating network "proximateapp_default" with the default driver
Creating proxy_instance
Creating queue_instance
Creating api_instance
Creating app_instance
Attaching to proxy_instance, queue_instance, api_instance, app_instance
proxy_instance     | Teeing proxy logs also to /remote/cache/proxy.log
proxy_instance     | [2017-05-10 09:18:42] stdout.INFO: Setting up queue at `/remote/cache/data` [] []
proxy_instance     | [2017-05-10 09:18:42] stdout.INFO: Starting proxy listener on 172.18.0.2:8081 [] []
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
api_instance       | PHP 7.0.16 Development Server started at Wed May 10 10:19:00 2017
app_instance       | PHP 5.6.29 Development Server started at Wed May 10 09:19:10 2017
app_instance       | PHP 5.6.29 Development Server started at Wed May 10 09:19:10 2017
queue_instance exited with code 0
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance exited with code 0
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)
queue_instance     | Starting queue watcher (path=/remote/queue, proxying to proximate-proxy:8081)

One thing I could try is to get the app to sleep and do nothing else, to prove that some weird exit handler or something is not causing havoc. However, I expect that will do exactly the same thing.

Update

I have replaced the queue with a script that prints time information then sleeps for 20 seconds. This is what I get:

halfer@machine:~/proximate-app$ docker-compose up
Creating network "proximateapp_default" with the default driver
Creating proxy_instance
Creating queue_instance
Creating api_instance
Creating app_instance
Attaching to proxy_instance, queue_instance, api_instance, app_instance
proxy_instance     | Teeing proxy logs also to /remote/cache/proxy.log
proxy_instance     | [2017-05-10 11:51:17] stdout.INFO: Setting up queue at `/remote/cache/data` [] []
proxy_instance     | [2017-05-10 11:51:17] stdout.INFO: Starting proxy listener on 172.18.0.2:8081 [] []
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:51:27 +0000. Microtime=1494417087.107185
api_instance       | PHP 7.0.16 Development Server started at Wed May 10 12:51:37 2017
app_instance       | PHP 5.6.29 Development Server started at Wed May 10 11:51:46 2017
app_instance       | PHP 5.6.29 Development Server started at Wed May 10 11:51:46 2017
queue_instance exited with code 0
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:51:27 +0000. Microtime=1494417087.107185
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:51:55 +0000. Microtime=1494417115.178871
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:52:22 +0000. Microtime=1494417142.409513
queue_instance exited with code 0
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:51:27 +0000. Microtime=1494417087.107185
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:51:55 +0000. Microtime=1494417115.178871
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:52:22 +0000. Microtime=1494417142.409513
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:52:49 +0000. Microtime=1494417169.612523
queue_instance     | Hello everyone! Time=Wed, 10 May 2017 11:53:17 +0000. Microtime=1494417197.826749

That seems to show that:

  • Restarts are reported only every other restart
  • My task will take exactly 20 seconds, but Compose is restarting them quite sluggishly (closer to every 30 seconds). That doesn't bother me too much, but may be a useful observation
  • Where lines are being repeated, they are in fact old restart reports.

What is going on?

like image 733
halfer Avatar asked May 10 '17 11:05

halfer


People also ask

What happens when you restart a docker container?

docker restart does two things: It does the equivalent of docker stop . It sends SIGTERM to its primary process (only); if that doesn't terminate within 10 seconds, it sends SIGKILL. If the primary process still has children, they also get forcibly terminated.

Does Docker compose up restart container?

The docker compose start command is useful only to restart containers that were previously created, but were stopped. It never creates new containers.

Does Docker compose restart failed containers?

no: Containers won't restart automatically. on-failure[:max-retries]: Restart the container if it exits with a non-zero exit code, and provide a maximum number of attempts for the Docker daemon to restart the container.

How do I stop my docker container from automatically restarting?

You can use the --restart=unless-stopped option, as @Shibashis mentioned, or update the restart policy (this requires docker 1.11 or newer); See the documentation for docker update and Docker restart policies. Use docker update --restart=no $(docker ps -a -q) to update all your containers :-) Great answer!!


2 Answers

I sum up what we discovered.

The cumulative output is due to the container is not being removed each time it exits. So, docker-compose starts it again showing you the docker logs -f from the same container that is starting again. That is working as expected.

I didn't find a way to get docker-compose removing a container before start it again inside the same docker-compose up

I like your queue-based architecture, but I think that it would be better if you use a long-lived container instead of one container that starts/exits/starts/exits... You can accomplish it doing a simple while true; do ./queue-app-blabla.php; sleep 20; done in a shell script.

Anyone that need to reproduce this, can use the following docker-compose.yml

version: "2"

services:
  ui:
    image: ubuntu
    command: tail -f /dev/null
    links:
      - queue
  queue:
    image: ubuntu
    command: sh -c "date; echo Sleeping...; sleep 20; echo exit."
    restart: always

Output:

queue_1  | Mon May 15 04:32:12 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.
docker_queue_1 exited with code 0
queue_1  | Mon May 15 04:32:12 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.
queue_1  | Mon May 15 04:32:33 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.
queue_1  | Mon May 15 04:32:54 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.
docker_queue_1 exited with code 0
queue_1  | Mon May 15 04:32:12 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.
queue_1  | Mon May 15 04:32:33 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.
queue_1  | Mon May 15 04:32:54 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.
queue_1  | Mon May 15 04:33:17 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.
queue_1  | Mon May 15 04:33:38 UTC 2017
queue_1  | Sleeping...
queue_1  | exit.

My proposed solution:

version: "2"

services:
  ui:
    image: ubuntu
    command: tail -f /dev/null
    links:
      - queue
  queue:
    image: ubuntu
    command: sh -c "while true; do date; echo Sleeping...; sleep 20; echo loop.; done"
    restart: always

Output:

Attaching to docker_queue_1, docker_ui_1
queue_1  | Mon May 15 04:36:16 UTC 2017
queue_1  | Sleeping...
queue_1  | loop.
queue_1  | Mon May 15 04:36:36 UTC 2017
queue_1  | Sleeping...
queue_1  | loop.
queue_1  | Mon May 15 04:36:56 UTC 2017
queue_1  | Sleeping...
queue_1  | loop.
queue_1  | Mon May 15 04:37:16 UTC 2017
queue_1  | Sleeping...

IMO, it's better to see containers start and work, instead of flapping start/exit containers. Docker has an special treatment to containers that have this behaviour, see docker docs. "An ever increasing delay (double the previous delay, starting at 100 milliseconds) is added before each restart to prevent flooding the server." (but it does not apply in your case because your container is living > 10s.


Edit:

To check a possible error and exit. This could be useful:

#!/bin/bash

PERIOD=10 # seconds

while true; do
  echo Starting process...

  date # the process
  STATUS=$?

  if [ $STATUS -ne 0 ]; then
        echo "Error found. Exiting" >&2
        exit $STATUS
  fi

  echo End process. Sleeping ${PERIOD}s
  sleep $PERIOD
done

Output:

▶ ./script.sh
Starting process...
Mon May 15 11:43:06 ART 2017
End process. Sleeping 10s
Starting process...
Mon May 15 11:43:16 ART 2017
End process. Sleeping 10s
Starting process...
Mon May 15 11:43:26 ART 2017
End process. Sleeping 10s
Starting process...
Mon May 15 11:43:36 ART 2017
End process. Sleeping 10s
Starting process...
Mon May 15 11:43:46 ART 2017
End process. Sleeping 10s
like image 86
Robert Avatar answered Oct 27 '22 05:10

Robert


Using the Docker restart functionality just to restart what should be a service (long running queue task) is not necessary at all as the operating system comes with everything you need.

Before making use of the docker restart functionality you restarted the queue your own. But you wrote you retired that Supervisor/Python because it is too fat (~55M) in your eyes.

So I read it that you want to keep the footprint low.

Let's do that. So my proposal is, start at the bare minimum, the operating system. It should have everything you need for your queue process management as that is what is common for operating systems: invoke and control other processes (to keep the list concise).

This is even true for the Alpine Linux PHP images with Busybox with a reduced set of commands and shells (e.g. no Bash).

So I assume you want to kick the PHP script every 5 minutes and then start it again in my example (without timeout it is possible as well, even with less code).

The docker image ships with the timeout command that does this. You can make use of it in the Dockerfile entrypoint sh script. So PHP is invoked and terminated after 300 seconds (five minutes):

# executed by sh command from BusyBox v1.24.2 (2017-01-18 14:13:46 GMT) multi-call binary
#
# usage: ENTRYPOINT ["sh", "docker-entrypoint.sh"]
set -eu

timed_queue() {
  set +e
  timeout \
    -t 300 \
    php -d max_execution_time=301 \
      -f /queue.php -- \
      --proxy-address "${PROXY_ADDRESS}" \
      --queue-path /remote/queue \
    ;
  retval=$?
  set -e
}

timed_queue
while [ $retval -eq 143 -o $retval -eq 0 ]; do
  timed_queue
done

The setting to the settings is done with the -t 300 command-line parameter to timeout. The exit status is 143 by default (SIGTERM) in case the timeout kicks in.

Your queue script returns 0 in case it ends successfully. In both cases the script is started again (with the timeout).

This requires an additional small change to your queue script. You make use of the die command in the following fashion to signal errors:

die(
    sprintf("Error: the supplied queue path `%s` does not exist\n", $queuePath)
);

Instead output the error message to standard error output and exit with a non-zero status code (e.g. 1):

fprintf(STDERR, "Error: the supplied queue path `%s` does not exist\n", $queuePath);
die(1);

This will make the script compatible with the operating system. On the command-line an exit status of zero (0) signals OK (that is using die("string") with a string message as in your example) but a non-zero code shows error (for starters use 1).

Done this, you will get the functionality of a re-starting and timeout-ing script with practically no overhead. That is both in file-size but as well in processing time. I've put this to a test with a timeout of one second and it shows only a very little overhead of some microseconds when restarted:

queue_instance     | info: 2017-05-19T14:00:24.34824700 queue.php started...
queue_instance     | info: 2017-05-19T14:00:25.35548200 queue.php started...
queue_instance     | info: 2017-05-19T14:00:26.34564400 queue.php started...
queue_instance     | info: 2017-05-19T14:00:27.35868700 queue.php started...
queue_instance     | info: 2017-05-19T14:00:28.34597300 queue.php started...
queue_instance     | info: 2017-05-19T14:00:29.34139800 queue.php started...
queue_instance     | info: 2017-05-19T14:00:30.26049500 queue.php started...
queue_instance     | info: 2017-05-19T14:00:31.26174500 queue.php started...
queue_instance     | info: 2017-05-19T14:00:32.26322800 queue.php started...
queue_instance     | info: 2017-05-19T14:00:33.26352800 queue.php started...
queue_instance     | info: 2017-05-19T14:00:34.26533300 queue.php started...
queue_instance     | info: 2017-05-19T14:00:35.26524300 queue.php started...
queue_instance     | info: 2017-05-19T14:00:36.26743300 queue.php started...
queue_instance     | info: 2017-05-19T14:00:37.26889500 queue.php started...
queue_instance     | info: 2017-05-19T14:00:38.27222300 queue.php started...
queue_instance     | info: 2017-05-19T14:00:39.27209000 queue.php started...
queue_instance     | info: 2017-05-19T14:00:40.27620500 queue.php started...
queue_instance     | info: 2017-05-19T14:00:41.27985300 queue.php started...
queue_instance     | info: 2017-05-19T14:00:42.28136100 queue.php started...
queue_instance     | info: 2017-05-19T14:00:43.28252200 queue.php started...
queue_instance     | info: 2017-05-19T14:00:44.28403600 queue.php started...
queue_instance     | info: 2017-05-19T14:00:45.28595300 queue.php started...
queue_instance     | info: 2017-05-19T14:00:46.28683900 queue.php started...
queue_instance     | info: 2017-05-19T14:00:47.28803800 queue.php started...

The subtle differences can most likely be ignored in your scenario as you want to run the script for five minutes.

In case the process crashes (fatal errors etc.) PHP will have an exit code of 127 so you will get notice as the container shuts down then. Same if there is a mistake in the Dockerfile entrypoint script, like an undefined environment variable.

So if I understood you right, this should be what you're looking for.

Depending on what is finally invoked via CMD or ENTRYPOINT, more process handling might apply. Or more handling might apply further. Some references:

  • Alternative to supervisord for docker
  • How critical is dumb-init for Docker?
  • su-exec (plain c), gosu (in go, MBs more)
  • run-parts (Busybox)
like image 20
hakre Avatar answered Oct 27 '22 06:10

hakre