
Celery multi inside docker container

I have a Python app with Celery running in Docker containers. I want to have several workers, each with its own concurrency and queues. For example:

celery worker -c 3 -Q queue1
celery worker -c 7 -Q queue2,queue3

But I don't know how to do this in Docker Compose. I found out about celery multi and tried to use it:

version: '3.2'
services:
  app:
    image: "app"
    build:
      context: .
    networks:
      - net
    ports:
      - 5004:5000
    stdin_open: true
    tty: true
    environment:
      FLASK_APP: app/app.py
      FLASK_DEBUG: 1
    volumes:
      - .:/home/app
  app__celery:
    image: "app"
    build:
      context: .
    command: sh -c 'celery multi start 2 -l INFO -c:1 3 -c:2 7 -Q:1 queue1 -Q:2 queue2,queue3'

But I get this:

app__celery_1  |    > celery1@1ab37081acb9: OK
app__celery_1  |    > celery2@1ab37081acb9: OK
app__celery_1 exited with code 0

And my Celery container exits. How can I keep it running and get its logs?

UPD: celery multi creates background processes. How can I run celery multi in the foreground?

asked Feb 06 '18 by dluhhbiu

People also ask

Can you run multiple apps in a Docker container?

It's ok to have multiple processes, but to get the most benefit out of Docker, avoid one container being responsible for multiple aspects of your overall application. You can connect multiple containers using user-defined networks and shared volumes.

Can a Docker container have multiple Entrypoints?

But since Docker allows only a single ENTRYPOINT (to be precise, only the last ENTRYPOINT in the Dockerfile has an effect), you need to find a way to run multiple processes (the tunnel and the application) with a single command.

Can a Docker container have multiple networks?

You can create multiple networks with Docker and add containers to one or more networks. Containers can communicate within networks but not across networks. A container with attachments to multiple networks can connect with all of the containers on all of those networks.

Can celery run multiple workers?

Not only can Celery run more than one worker, that is in fact the very point and the reason Celery exists: its whole job is to manage not just multiple workers, but conceivably workers distributed across machines.


3 Answers

Depending on your application's needs and design, you may actually want to separate the workers into different containers for different tasks.

However, if there's low resource usage and it makes sense to combine multiple workers in a single container, you can do it via an entrypoint script.

Edit 2019-12-05: After running this for a while, I've concluded it's not a good idea for production use. There are 2 caveats:

  1. There is a risk of the background worker silently exiting without this being captured in the foreground. The tail -f will continue to run, but Docker will not know that the background worker has stopped. Depending on your celery log level settings, the logs may show some indication, but Docker is unaware of it when you run docker ps. To be reliable, the workers need to restart on failure, which brings us to the suggestion of using supervisord.

  2. As a container is started and stopped (but not removed), the Docker container state is kept. This means that if your celery workers depend on a pidfile for identification and there is an ungraceful shutdown, there is a chance the pidfile is kept and the worker will not restart cleanly, even with docker stop; docker start. This is because celery's startup detects the leftover pidfile from the previous unclean shutdown and, to prevent multiple instances, the restarted worker stops itself with "PIDfile found, celery is already running?". The whole container must then be removed with docker rm, or docker-compose down; docker-compose up. A few ways of dealing with this:

    a. run the container with the --rm flag so the container is removed once it is stopped (a docker run sketch follows this list).

    b. perhaps not including the --pidfile parameter in the celery multi or celery worker command would work better.
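
For example, a sketch of option (a) with plain docker run, assuming the image from the question is tagged app and has the entrypoint script described below baked in:

# --rm removes the stopped container, and with it any stale celery pidfile kept in
# its filesystem, so the next start is not blocked by an unclean previous shutdown
docker run --rm --name app__celery app

With Compose, the closest equivalent is removing the containers with docker-compose down (or docker-compose rm) before starting again.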

Summary Recommendation: It is probably better to use supervisord.

Now, on to the details:

Docker containers need a foreground task to be running, or the container will exit. This will be addressed further down.

In addition, celery workers may run long-running tasks, and need to respond to Docker's shutdown (SIGTERM) signal to shut down gracefully, i.e. finish up long-running tasks before shutdown or restart.

To achieve Docker signal propagation and handling, it is best to declare the entrypoint within the Dockerfile in Docker's exec form; you may also do this in the docker-compose file.
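
For example, a minimal Dockerfile fragment in exec form (the script name entrypoint.sh here is just an assumed example):

# copy the entrypoint script (shown further down) into the image and make it executable
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

# exec form (JSON array): the script runs as PID 1 and receives SIGTERM directly,
# instead of being wrapped by "sh -c", which would not forward the signal
ENTRYPOINT ["/entrypoint.sh"]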

In addition, since celery multi works in the background, Docker can't see any logs. You'll need to show the logs in the foreground so that docker logs can see what is happening. We'll do this by setting a logfile for the celery multi workers and displaying it in the console foreground with tail -f <logfile_pattern>, which runs indefinitely.

We need to achieve three objectives:

  1. Run the docker container with a foreground task
  2. Receive, trap, and handle docker shutdown signals
  3. Shutdown the workers gracefully

For #1, we will run tail -f & and then wait on it as the foreground task.

For #2, this is achieved by setting the trap function and trapping the signal. To receive and handle signals with the trap function, wait has to be the running foreground task, which is achieved in #1.

For #3, we will run celery multi stop <number_of_workers_in_start_command>, mirroring the argument parameters used during startup with celery multi start.

Here's the gist I wrote, copied here:

#!/bin/sh

# safety switch, exit script if there's error. Full command of shortcut `set -e`
set -o errexit
# safety switch, uninitialized variables will stop script. Full command of shortcut `set -u`
set -o nounset

# tear down function
teardown()
{
    echo " Signal caught..."
    echo "Stopping celery multi gracefully..."

    # send shutdown signal to celery workers via `celery multi`
    # command must mirror some of `celery multi start` arguments
    celery -A config.celery_app multi stop 3 --pidfile=./celery-%n.pid --logfile=./celery-%n%I.log

    echo "Stopped celery multi..."
    echo "Stopping last waited process"
    kill -s TERM "$child" 2> /dev/null
    echo "Stopped last waited process. Exiting..."
    exit 1
}

# start 3 celery worker via `celery multi` with declared logfile for `tail -f`
celery -A config.celery_app multi start 3 -l INFO -Q:1 queue1 -Q:2 queue1 -Q:3 queue3,celery -c:1-2 1 \
    --pidfile=./celery-%n.pid \
    --logfile=./celery-%n%I.log

# start trapping signals (docker sends `SIGTERM` for shutdown)
trap teardown SIGINT SIGTERM

# tail all the logs continuously to console for `docker logs` to see
tail -f ./celery*.log &

# capture process id of `tail` for tear down
child=$!

# waits for `tail -f` indefinitely and allows external signals,
# including docker stop signals, to be captured by `trap`
wait "$child"

Use the code above as the contents of the entrypoint script file, and modify it according to your needs.

Declare it in the dockerfile or docker-compose file in exec form:

ENTRYPOINT ["entrypoint_file"]

The celery workers can then run in the docker container and can also be gracefully stopped.
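
If you'd rather declare it in the docker-compose file, the list form of entrypoint is the equivalent of exec form. A sketch based on the app__celery service from the question, assuming the script was copied to /entrypoint.sh in the image:

  app__celery:
    image: "app"
    build:
      context: .
    # list form == exec form: the entrypoint script receives docker stop's SIGTERM
    entrypoint: ["/entrypoint.sh"]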

answered Sep 18 '22 by VKen

I solved this task as follows: I used supervisord instead of celery multi. supervisord starts in the foreground, so my container doesn't close.

command: supervisord -c supervisord.conf

And I added all the queues to supervisord.conf:

[program:celery]
command = celery worker -A app.celery.celery -l INFO -c 3 -Q q1
directory = %(here)s
startsecs = 5
autostart = true
autorestart = true
stopwaitsecs = 300
stderr_logfile = /dev/stderr
stderr_logfile_maxbytes = 0
stdout_logfile = /dev/stdout
stdout_logfile_maxbytes = 0

[program:beat]
command = celery -A app.celery.celery beat -l INFO --pidfile=/tmp/beat.pid
directory = %(here)s
startsecs = 5
autostart = true
autorestart = true
stopwaitsecs = 300
stderr_logfile = /dev/stderr
stderr_logfile_maxbytes = 0
stdout_logfile = /dev/stdout
stdout_logfile_maxbytes = 0

[supervisord]
loglevel = info
nodaemon = true
pidfile = /tmp/supervisord.pid
logfile = /dev/null
logfile_maxbytes = 0
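
For the second worker from my question (concurrency 7 on the other queues, shortened here to q2,q3 like q1 above), one more program block along the same lines should work; this block is just a sketch:

[program:celery2]
command = celery worker -A app.celery.celery -l INFO -c 7 -Q q2,q3
directory = %(here)s
startsecs = 5
autostart = true
autorestart = true
stopwaitsecs = 300
stderr_logfile = /dev/stderr
stderr_logfile_maxbytes = 0
stdout_logfile = /dev/stdout
stdout_logfile_maxbytes = 0
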
answered Sep 19 '22 by dluhhbiu


First, I don't see the advantage of combining celery multi with Docker. As I see it, you want each worker in a separate container; that way you get flexibility and a micro-services environment.

If you still want to have multiple workers in the same container, I can suggest a workaround to keep your container open: add while true; do sleep 2; done to the end of your command, i.e. celery multi start 2 -l INFO -c:1 3 -c:2 7 -Q:1 queue1 -Q:2 queue2,queue3 && while true; do sleep 2; done.

Alternatively, wrap it in a short script:

#!/bin/bash
celery multi start 2 -l INFO -c:1 3 -c:2 7 -Q:1 queue1 -Q:2 queue2,queue3
while true; do sleep 2; done
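
Assuming the script is saved as, say, start_workers.sh and made executable in the image, the compose service from the question could then point at it:

  app__celery:
    image: "app"
    build:
      context: .
    # the script must be executable and reachable from the container's working directory
    command: ./start_workers.sh
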
answered Sep 18 '22 by ItayB