I just watched a YouTube video where the presenter mentioned that one should design Celery tasks to be short. Tasks running for several minutes are bad.
Is this correct? What I do see is that I have some long-running tasks, which take, say, 10 minutes to finish. When this kind of task is scheduled frequently, the queue is swamped and no other tasks get scheduled. Is this the reason?
If so, what should be used for long running tasks?
Celery beat only triggers those 1000 tasks (according to the crontab schedule); it does not run them.
If a task is revoked, the workers ignore the task and do not execute it. The list of revoked tasks is kept in worker memory, so if you don't use persistent revokes, your task can still be executed after the workers restart. revoke has a terminate option, which is False by default. If you need to kill a task that is already executing, you need to set terminate to True.
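A minimal sketch of the two revoke modes described above. The helper name cancel_task and its kill_running parameter are made up for illustration, and app is assumed to be an already configured Celery application; the only Celery call used is app.control.revoke.

```python
def cancel_task(app, task_id, kill_running=False):
    """Revoke a task by id (illustrative helper, not part of Celery).

    `app` is assumed to be a configured celery.Celery instance.
    With kill_running=False, workers skip the task if it has not
    started yet. With kill_running=True, the worker process that is
    currently executing the task is also sent SIGTERM.
    """
    app.control.revoke(task_id, terminate=kill_running, signal="SIGTERM")
```

To keep the revoked ids across worker restarts, the worker can be started with a state file, for example `celery -A proj worker --statedb=/var/run/celery/worker.state` (the app name and path here are placeholders).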
Some Celery Terminology: A task is just a Python function. You can think of scheduling a task as a time-delayed call to the function. For example, you might ask Celery to call your function task1 with arguments (1, 3, 3) after five minutes. Or you could have your function batchjob called every night at midnight.
Celery is an open source asynchronous task queue or job queue which is based on distributed message passing. While it supports scheduling, its focus is on operations in real time.
Augment the basic Task definition to optionally treat the task instantiation as a generator, and check for TERM or soft timeout on every iteration through the generator. Generically inject a "state" dict kwarg into tasks that support it. If it's the first time the task is run, allocate a new one in results cache, otherwise look up the existing one from results cache.
In your task, figure out a good place to yield which results in short execution times. Update the state parameter as necessary.
When control returns to the master task class, check for TERM or soft timeout, and if there is one, save off the state object and respond to the signal.
The problem with long running tasks is that you have to wait for them when you're pushing a new software version on your server. If you don't wait, your task may run possibly incompatible code, especially if you pickled some complex object as a parameter (which is strongly discouraged).
Long running tasks aren't great, but it's by no means appropriate to say they are bad. The best way to handle long running tasks is to create a queue for just those tasks and have them run on a separate worker than the short tasks.
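As a rough sketch of that setup, a Celery configuration module could route the slow task to its own queue. The module name celeryconfig.py, the task name myapp.tasks.generate_report and the queue names are placeholders, not anything from the original question; task_default_queue, task_routes and worker_prefetch_multiplier are standard Celery settings.

```python
# celeryconfig.py (hypothetical module, task and queue names)

# Tasks that are not listed in task_routes go to this queue.
task_default_queue = "default"

# Send the slow task to a dedicated queue.
task_routes = {
    "myapp.tasks.generate_report": {"queue": "long_running"},
}

# Limit how many extra tasks each worker process reserves in advance,
# so waiting tasks are not held by a process that is busy with a long one.
worker_prefetch_multiplier = 1
```

Assuming the app loads this with app.config_from_object("celeryconfig"), one worker can then serve each queue, for example `celery -A myapp worker -Q default --concurrency=4` and `celery -A myapp worker -Q long_running --concurrency=2`. The short tasks keep flowing even when every process of the long_running worker is busy.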