Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

celery and long running tasks

Tags:

python

celery

I just watch a youtube video where the presenter mentioned that one should design his/her celery to be short. Tasks running several minutes are bad.

Is this correct? What I do see is that I have some long running task, which takes say 10 minutes to finish. When these kind of task is scheduled frequently, the queue is swamped and no other tasks get scheduled. Is this the reason?

If so, what should be used for long running tasks?

like image 357
lang2 Avatar asked Jul 29 '15 05:07

lang2


People also ask

How many tasks can Celery handle?

1 Answer. Show activity on this post. celery beats only trigger those 1000 tasks (by the crontab schedule), not run them.

How do I stop Celery from running?

If a task is revoked, the workers ignore the task and do not execute it. If you don't use persistent revokes your task can be executed after worker's restart. revoke has an terminate option which is False by default. If you need to kill the executing task you need to set terminate to True.

How do you schedule celery tasks?

Some Celery Terminology: A task is just a Python function. You can think of scheduling a task as a time-delayed call to the function. For example, you might ask Celery to call your function task1 with arguments (1, 3, 3) after five minutes. Or you could have your function batchjob called every night at midnight.

Is Celery a task queue?

Celery is an open source asynchronous task queue or job queue which is based on distributed message passing. While it supports scheduling, its focus is on operations in real time.


3 Answers

Augment the basic Task definition to optionally treat the task instantiation as a generator, and check for TERM or soft timeout on every iteration through the generator. Generically inject a "state" dict kwarg into tasks that support it. If it's the first time the task is run, allocate a new one in results cache, otherwise look up the existing one from results cache.

In your task, figure out a good place to yield which results in short execution times. Update the state parameter as necessary.

When control returns to the master task class, check for TERM or soft timeout, and if there is one, save off the state object and respond to the signal.

like image 149
technomage Avatar answered Nov 27 '22 12:11

technomage


The problem with long running tasks is that you have to wait for them when you're pushing a new software version on your server. If you don't wait, your task may run possibly incompatible code, especially if you pickled some complex object as a parameter (which is strongly discouraged).

like image 27
Ctrl-C Avatar answered Nov 27 '22 12:11

Ctrl-C


Long running tasks aren't great but It's by no means appropriate to say they are bad. The best way to handle long running tasks is to create a queue for just those tasks and have them run on a separate worker then the short tasks.

like image 32
user2097159 Avatar answered Nov 27 '22 13:11

user2097159