 

Celery - run different workers on one server

Tags: django, celery

I have two kinds of tasks. Type1: a few small, high-priority tasks. Type2: many heavy tasks with lower priority.

Initially I had a simple configuration with default routing; no routing keys were used. It was not sufficient: sometimes all workers were busy with Type2 tasks, so Type1 tasks were delayed. I've added routing keys:

CELERY_DEFAULT_QUEUE = "default"
CELERY_QUEUES = {
    "default": {
        "binding_key": "task.#",
    },
    "highs": {
        "binding_key": "starter.#",
    },
}
CELERY_DEFAULT_EXCHANGE = "tasks"
CELERY_DEFAULT_EXCHANGE_TYPE = "topic"
CELERY_DEFAULT_ROUTING_KEY = "task.default"

CELERY_ROUTES = {
    "search.starter.start": {
        "queue": "highs",
        "routing_key": "starter.starter",
    },
}

So now I have two queues: one for high-priority tasks and one for low-priority tasks.
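For context, here is a minimal sketch of the task side this routing assumes. The module path and function body are hypothetical; what matters is the dotted task name, which is what CELERY_ROUTES matches on (Celery 2.x decorator API, current at the time):

# search/starter.py -- hypothetical module; only the dotted task
# name below matters, since CELERY_ROUTES matches on it.
from celery.task import task

@task(name="search.starter.start")
def start(query_id):
    # CELERY_ROUTES sends this task to the "highs" queue with
    # routing_key "starter.starter", so start.delay(...) ends up
    # with the high-priority workers.
    pass

# Tasks with no explicit route fall back to the defaults above:
# exchange "tasks", routing_key "task.default", i.e. the "default" queue.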

The problem is: how do I start two celeryd instances with different concurrency settings?

Previously celery was used in daemon mode (according to this), so only /etc/init.d/celeryd start was required, but now I have to run two different celeryd instances with different queues and concurrency. How can I do it?

Andrew asked Mar 28 '11


People also ask

How do you run multiple celery workers?

You probably just need to add the --concurrency or -c argument when starting the worker to spawn multiple (parallel) worker processes. You can also look at the Canvas primitives to see how to make groups for parallel execution.
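As a sketch of that Canvas idea, assuming a modern Celery app and a hypothetical add task (the broker and backend URLs are assumptions):

from celery import Celery, group

app = Celery("demo", broker="amqp://", backend="rpc://")  # assumed URLs

@app.task
def add(x, y):
    return x + y

# Fan ten add() calls out to the worker pool; they run in parallel,
# up to the worker's --concurrency limit.
job = group(add.s(i, i) for i in range(10))
result = job.apply_async()
print(result.get())  # [0, 2, 4, ..., 18] once all subtasks finish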

How many celery workers can I run?

There is no point in running more than one worker on a particular machine unless you want to do routing. I would suggest running only 1 worker per machine with the default number of processes.

How does celery define workers?

When you run a celery worker, it creates one parent process to manage the running tasks. This process handles bookkeeping features like sending/receiving queue messages, registering tasks, killing hung tasks, tracking status, etc.

Can celery tasks be async?

Celery is a task queue/job queue based on asynchronous message passing. It can be used as a background task processor for your application: you hand it tasks to execute in the background, either immediately or at a given time. It can be configured to execute your tasks synchronously or asynchronously.
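A small illustration of that distinction, with a hypothetical task and an assumed broker URL (modern Celery API):

from celery import Celery

app = Celery("demo", broker="amqp://")  # assumed broker URL

@app.task
def resize_image(path):
    print("resizing", path)

resize_image("a.png")        # plain call: runs synchronously, in-process
resize_image.delay("a.png")  # queued: a worker executes it asynchronously
resize_image.apply_async(args=["a.png"], countdown=60)  # queued, runs ~60s later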


2 Answers

Based on the above answer, I formulated the following /etc/default/celeryd file (originally based on the configuration described in the daemonizing docs: http://ask.github.com/celery/cookbook/daemonizing.html), which works for running two celery workers on the same machine, each worker servicing a different queue (in this case the queue names are "default" and "important").

This answer is an extension of the previous one: it shows how to do the same thing, but for celery in daemon mode. Please note that we are using django-celery here:

CELERYD_NODES="w1 w2"

# Where to chdir at start.
CELERYD_CHDIR="/home/peedee/projects/myproject/myproject"

# Python interpreter from environment.
#ENV_PYTHON="$CELERYD_CHDIR/env/bin/python"
ENV_PYTHON="/home/peedee/projects/myproject/myproject-env/bin/python"

# How to call "manage.py celeryd_multi"
CELERYD_MULTI="$ENV_PYTHON $CELERYD_CHDIR/manage.py celeryd_multi"

# How to call "manage.py celeryctl"
CELERYCTL="$ENV_PYTHON $CELERYD_CHDIR/manage.py celeryctl"

# Extra arguments to celeryd
# Longest task: 10 hrs (as of writing this, the UpdateQuanitites task takes 5.5 hrs)
CELERYD_OPTS="-Q:w1 default -c:w1 2 -Q:w2 important -c:w2 2 --time-limit=36000 -E"

# Name of the celery config module.
CELERY_CONFIG_MODULE="celeryconfig"

# %n will be replaced with the nodename.
CELERYD_LOG_FILE="/var/log/celery/celeryd.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"

# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="settings"

# celerycam configuration
CELERYEV_CAM="djcelery.snapshot.Camera"
CELERYEV="$ENV_PYTHON $CELERYD_CHDIR/manage.py celerycam"
CELERYEV_LOG_FILE="/var/log/celery/celerycam.log"

# Where to chdir at start.
CELERYBEAT_CHDIR="/home/peedee/projects/cottonon/cottonon"

# Path to celerybeat
CELERYBEAT="$ENV_PYTHON $CELERYBEAT_CHDIR/manage.py celerybeat"

# Extra arguments to celerybeat.  This is a file that will get
# created for scheduled tasks.  It's generated automatically
# when Celerybeat starts.
CELERYBEAT_OPTS="--schedule=/var/run/celerybeat-schedule"

# Log level. Can be one of DEBUG, INFO, WARNING, ERROR or CRITICAL.
CELERYBEAT_LOG_LEVEL="INFO"

# Log file locations
CELERYBEAT_LOGFILE="/var/log/celerybeat.log"
CELERYBEAT_PIDFILE="/var/run/celerybeat.pid"
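With that file in place, the generic init script from the daemonizing guide drives both nodes together. A sketch of the usual actions (start/stop/restart are the standard ones; I have not verified every version of that script):

sudo /etc/init.d/celeryd start    # starts w1 (default queue) and w2 (important queue)
sudo /etc/init.d/celeryd restart  # restarts both nodes
sudo /etc/init.d/celeryd stop     # stops both nodes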
eedeep answered Sep 22 '22


It seems the answer, celeryd-multi, is currently not well documented.

What I needed can be done with the following command:

celeryd-multi start 2 -Q:1 default -Q:2 starters -c:1 5 -c:2 3 \
    --loglevel=INFO \
    --pidfile=/var/run/celery/${USER}%n.pid \
    --logfile=/var/log/celeryd.${USER}%n.log

What we do is start two workers listening to different queues (-Q:1 default, -Q:2 starters) with different concurrency levels (-c:1 5 and -c:2 3).
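If you want to double-check what that expands to before starting anything, celeryd-multi also has a show command that prints the underlying celeryd command lines without running them (same arguments as start; assuming your Celery version ships it):

celeryd-multi show 2 -Q:1 default -Q:2 starters -c:1 5 -c:2 3 --loglevel=INFO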

Andrew answered Sep 21 '22