
In Celery, how to ensure tasks are retried when a worker crashes

First of all, please don't consider this question a duplicate of this question

I have set up an environment which uses Celery, with Redis as the broker and result backend. My question is: how can I make sure that when a Celery worker crashes, all of its scheduled tasks are retried once the worker is back up?

I have seen advice to use CELERY_ACKS_LATE = True, so that the broker will redeliver a task until it gets an ACK, but in my case it's not working. Whenever I schedule a task, it immediately goes to the worker, which holds it until the scheduled time of execution. Let me give an example:

I am scheduling a task like this: res = test_task.apply_async(countdown=600), but immediately in the Celery worker logs I can see something like: Got task from broker: test_task[a137c44e-b08e-4569-8677-f84070873fc0] eta:[2013-01-...]. Now when I kill the Celery worker, these scheduled tasks are lost. My settings:

BROKER_URL = "redis://localhost:6379/0"  
CELERY_ALWAYS_EAGER = False  
CELERY_RESULT_BACKEND = "redis://localhost:6379/0"  
CELERY_ACKS_LATE = True
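
For reference, a minimal sketch of how such a task might be defined and scheduled (the task name test_task comes from the question; the module name and app setup are assumptions):

from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def test_task():
    # stand-in for the real work
    return "done"

# Schedule the task to run 600 seconds from now. The worker fetches it
# from the broker immediately and holds it in memory until the ETA.
res = test_task.apply_async(countdown=600)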
asked Jan 18 '13 by aqs



1 Answer

Apparently this is how Celery behaves. When a worker process is abruptly killed (but the dispatching process isn't), the message is considered 'failed' even though you have acks_late=True.

The motivation (to my understanding) is that if the consumer was killed by the OS due to an out-of-memory condition, there is no point in redelivering the same task: it would likely just kill the next worker too.

You may see the exact issue here: https://github.com/celery/celery/issues/1628

I actually disagree with this behaviour. IMO it would make more sense not to acknowledge.
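
For what it's worth, later Celery versions (4.0+) added a setting that changes exactly this behavior: task_reject_on_worker_lost, which requeues a task whose worker process died mid-execution. A minimal sketch, assuming Celery 4+ and the Redis broker from the question (module name is an assumption):

from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

# acks_late alone is not enough: a task whose worker process dies is
# still acknowledged and marked as failed. task_reject_on_worker_lost
# tells the broker to requeue it instead.
app.conf.task_acks_late = True
app.conf.task_reject_on_worker_lost = True

# Use with care: a task that reliably crashes its worker (e.g. via OOM)
# will be redelivered and crash the next worker too.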

answered Oct 09 '22 by odedfos