
In Celery, how can I keep long-delayed tasks from blocking newer ones?

I have two kinds of tasks. Task A is generated by celerybeat every hour. It runs immediately, and generates a thousand (or many thousand) instances of Task B, each of which has an ETA of one day in the future.

Upon startup, an instance of Task A runs and generates a thousand Bs. And from then on, nothing happens. I should see another A running each hour, with another thousand Bs. But in fact I see nothing.

At the freeze, rabbitmqctl shows 1000 messages, with 968 ready and 32 unacknowledged. An hour later there are 1001 messages, 969 ready and 32 unacknowledged. And so forth, every hour one new message classified as ready. Presumably what's happening is that the worker is prefetching 32 messages but can't act on them, because their ETA is still in the future. Meantime, newer tasks that should run right now can't be run.

What is the right way to handle this? I'm guessing that I need multiple workers, and maybe multiple queues (but I'm not sure of the latter point). Is there a simpler way? I've tried fiddling with CELERYD_PREFETCH_MULTIPLIER and -Ofair (as discussed here: http://celery.readthedocs.org/en/latest/userguide/optimizing.html) but can't get it to work. Is my question the same as this one: [[Django Celery]] Celery blocked doing IO tasks ?

In any case: I can address this issue only because I know a lot about the nature of the tasks and their timing. Doesn't it seem a design flaw that enough tasks with future ETA can lock up the whole system? If I wait a few hours, then kill and restart the worker, it once again grabs the first 32 tasks and freezes up, even though at this point there are tasks in the queue that are ready to run right now. Shouldn't some component be smart enough to look at ETAs and ignore tasks that aren't runnable?

ADDENDUM: I now think that the problem is a known bug when RabbitMQ 3.3 is used with Celery 3.1.0. More information here: https://groups.google.com/forum/#!searchin/celery-users/countdown|sort:date/celery-users/FiAAESOzezA/499OH-pylacJ

After updating to Celery 3.1.1, things seem better. Task A runs hourly (well, it has for a couple of hours) and schedules its copies of Task B. Those seem to be filling up the worker: The number of unacknowledged messages continues to grow. I'll have to see if it can grow without bound.

asked May 13 '14 by user620316

People also ask

What does delay do in Celery?

delay() is the quickest way to send a task message to Celery. This method is a shortcut to the more powerful apply_async(), which additionally supports execution options for fine-tuning your task message.

What is Shared_task in Celery?

The "shared_task" decorator allows you to create Celery tasks for reusable apps, since it doesn't require an instance of the Celery app. It is also an easier way to define a task, as you don't need to import the Celery app instance.

How many tasks can Celery handle?

celery beat only triggers those 1000 tasks (by the crontab schedule); it does not run them. If you want to run 1000 tasks in parallel, you need enough Celery workers available to run them.

What is concurrency in Celery?

As for --concurrency: Celery by default uses multiprocessing to perform concurrent execution of tasks. The number of worker processes/threads can be changed using the --concurrency argument; it defaults to the number of available CPUs if not set.


1 Answer

It seems like this is an issue that can be solved with routing: http://celery.readthedocs.org/en/latest/userguide/routing.html

When using routing, you can have multiple queues that are populated with different types of tasks. If you want Task B not to block Task A, you could put them on separate queues with different priority, so that your workers work through the large queue full of Task Bs, but when a Task A arrives it is pulled by the next available worker.

The added benefit of this is that you can also assign more workers to heavily filled queues and those workers will only pull from the designated high volume queue.

answered Sep 24 '22 by Kyle Owens