
Scheduling celery tasks with large ETA

I am currently experimenting with future tasks in celery using the ETA feature and a redis broker. One of the known issues with using a redis broker has to do with the visibility timeout:

If a task isn’t acknowledged within the Visibility Timeout the task will be redelivered to another worker and executed.

This causes problems with ETA/countdown/retry tasks where the time to execute exceeds the visibility timeout; in fact if that happens it will be executed again, and again in a loop.

Some tasks that I can envision will have an ETA on the timescale of weeks/months. Setting the visibility timeout large enough to encompass these tasks is probably unwise.

Are there any paths forward for processing these tasks with a redis broker? I am aware of this question. Is changing brokers the only option?

asked Aug 04 '17 23:08 by smang



1 Answer

I am doing this with redis in the following way:

We have customers that can schedule a release of some of their content. We store the release in our database with the time it should be executed at.

Then we use celery beat to run a periodic task (hourly, or whatever suits you) that checks our releases table for releases scheduled within the next period (again, an hour or whatever suits you). If any are found, we schedule a Celery task for each of them. This keeps every ETA short.
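A rough sketch of that periodic check, in case it helps. It is not our actual code: the `Release` class, the `dispatched` flag (used here to avoid scheduling the same release twice) and the `dispatch_due` function are hypothetical names made up for illustration, and the database is replaced by a plain list. The `schedule` argument is a callable so the selection logic can be read and tried without a broker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass
class Release:
    # Hypothetical stand-in for a row in the releases table.
    id: int
    release_at: datetime
    dispatched: bool = False  # assumption: marks rows already handed to Celery


def dispatch_due(
    releases: Iterable[Release],
    now: datetime,
    window: timedelta,
    schedule: Callable[[Release], None],
) -> List[int]:
    """Hand every not-yet-dispatched release due before now + window to `schedule`.

    Overdue releases (release_at earlier than `now`) are included too, so a
    missed beat run is caught up on the next one.
    """
    horizon = now + window
    sent = []
    for release in releases:
        if not release.dispatched and release.release_at <= horizon:
            schedule(release)          # e.g. enqueue a Celery task with a short ETA
            release.dispatched = True  # in a real app this would be persisted
            sent.append(release.id)
    return sent
```

In a Celery app, `schedule` could be something like `lambda r: publish_release.apply_async(args=[r.id], eta=r.release_at)`, where `publish_release` is a hypothetical task, and the function would be called from a periodic task registered in `app.conf.beat_schedule`. The window should stay below the broker's visibility timeout (one hour by default for Redis), otherwise the redelivery problem from the question comes back.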

answered Oct 05 '22 22:10 by NG.