Celery not executing new tasks after a lost Redis connection is re-established

I have a Celery worker configured to connect to redis as follows:

from celery import Celery
from django.conf import settings

celery_app_site24x7 = Celery('monitoringExterne.celerys.site24x7',
                             broker=settings.REDIS['broker'],
                             backend=settings.REDIS['backend'])

celery_app_site24x7.conf.broker_transport_options = {
    'visibility_timeout': 36000
}

celery_app_site24x7.conf.socket_timeout = 300
celery_app_site24x7.conf.broker_connection_max_retries = None
celery_app_site24x7.config_from_object('django.conf:settings')
celery_app_site24x7.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
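
For reference, settings.REDIS is assumed to be a dict holding the broker and backend URLs; it is not shown in the question, but based on the connection logs below it would look roughly like this:

# Assumed shape of settings.REDIS -- the URL mirrors the one in the
# worker logs (redis://xxx.xxx.xxx.xx:6380/10); adjust to your environment.
REDIS = {
    'broker': 'redis://xxx.xxx.xxx.xx:6380/10',
    'backend': 'redis://xxx.xxx.xxx.xx:6380/10',
}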

The issue is that when Redis goes down and the connection is then re-established, new tasks added to the queue are not executed:

[2020-01-13 10:10:14,517: ERROR/MainProcess] consumer: Cannot connect to redis://xxx.xxx.xxx.xx:6380/10: Error while reading from socket: ('Connection closed by server.',).
Trying again in 2.00 seconds...

[2020-01-13 10:10:16,590: INFO/MainProcess] Connected to redis://xxx.xxx.xxx.xx:6380/10
[2020-01-13 10:10:16,699: INFO/MainProcess] mingle: searching for neighbors
[2020-01-13 10:10:17,766: INFO/MainProcess] mingle: all alone

I manually called a Celery task through the Django shell as follows:

celery_tasks.site24x7.test.delay()

It returns an AsyncResult with the task ID, but the worker does not process the task:

<AsyncResult:ff634b85-edb5-44d4-bdb1-17a220761fcc>
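
For context, a task like the one called above would be defined roughly as follows (a minimal sketch; the celery_tasks.site24x7 module layout is inferred from the call and is not shown in the question):

# celery_tasks/site24x7.py -- assumed module path
from monitoringExterne.celerys.site24x7 import celery_app_site24x7

@celery_app_site24x7.task
def test():
    # Trivial body for testing; the real task body is not shown in the question.
    return 'ok'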

If I keep launching the task with delay(), the queue length keeps growing:

127.0.0.1:6379[10]> llen site24x7
(integer) 4
127.0.0.1:6379[10]> llen site24x7
(integer) 5
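
The same check can be scripted from Python with redis-py (database 10 and the queue name are taken from the redis-cli session above):

import redis

# Connect to the same database the broker uses and count pending messages.
r = redis.Redis(host='127.0.0.1', port=6379, db=10)
print(r.llen('site24x7'))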

Below is the output of celery status and celery inspect:

$ celery -A monitoringExterne --app=monitoringExterne.celerys.site24x7 status

Error: No nodes replied within time constraint.

$ celery -A monitoringExterne --app=monitoringExterne.celerys.site24x7 inspect active

Error: No nodes replied within time constraint.

asked by Kheshav Sewnundun


2 Answers

If your workers are not subscribed to the site24x7 queue, the number of tasks in that queue will keep increasing. Try running the worker with something like: celery -A monitoringExterne.celerys.site24x7 worker -Q site24x7 -l info

Also, keep in mind that -A and --app are the same flag; you should not use both.

If you are getting the No nodes replied within time constraint output, that means no Celery workers are active in your cluster, which could also be why the number of tasks in that queue keeps increasing: there are no workers to execute them! A routing sketch follows below.
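
For completeness, here is a minimal sketch of the routing side, so that tasks actually land in the site24x7 queue the worker consumes (the glob pattern assumes the tasks live under celery_tasks.site24x7, as suggested by the call in the question):

# Route all tasks under celery_tasks.site24x7 to the site24x7 queue.
celery_app_site24x7.conf.task_routes = {
    'celery_tasks.site24x7.*': {'queue': 'site24x7'},
}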

answered by DejanLekic


There appears to be a known issue with Celery on the Redis transport that may be the actual cause here:

Worker stops consuming tasks after redis reconnection
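
Until that is fixed upstream, one workaround is to detect the stuck state externally, for example by pinging the workers from a healthcheck script and letting your process supervisor restart the worker when nothing replies. A minimal sketch using Celery's control API:

from monitoringExterne.celerys.site24x7 import celery_app_site24x7

# ping() broadcasts to all workers and collects replies within the timeout.
replies = celery_app_site24x7.control.ping(timeout=5)
if not replies:
    # Same symptom as "No nodes replied within time constraint":
    # no worker answered, so have your supervisor restart the worker.
    print('no workers responding; restart the worker process')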

answered by Klaas van Schelven