
How to prevent multiple workers from running a task sent ONLY ONCE?

Recently I noticed a weird Celery (3.1.25) behaviour. A task is queued for execution using send_task() only once; however, after a while I see multiple workers running the same task! I have spent hours looking through the Celery documentation trying to find out how to prevent this behaviour. Any help will be greatly appreciated!
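
For reference, the task is dispatched with a single call roughly like this (the app setup and broker URL are illustrative, not the actual project code; the task name and arguments are taken from the listing below):

    from celery import Celery

    # Celery 3.1-style app pointed at the Redis broker (URL is an example).
    app = Celery('proj', broker='redis://localhost:6379/0')

    # One call, by task name, so only one message should reach the queue.
    app.send_task('parsing.2pass', args=['myex', 'equities', 20170103])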

Here is the output of inspect active:

...
-> [email protected]: OK
    * {'hostname': '[email protected]', 'id': '5bf971b7-c2d2-47a1-9e3e-abec6c3c7ab4', 'args': "['myex', 'equities', 20170103]", 'time_start': 1633747.663716712, 'name': 'parsing.2pass', 'acknowledged': False, 'delivery_info': {'exchange': 'celery', 'priority': 0, 'redelivered': None, 'routing_key': 'celery'}, 'worker_pid': 28649, 'kwargs': '{}'}
    * {'hostname': '[email protected]', 'id': '5bf971b7-c2d2-47a1-9e3e-abec6c3c7ab4', 'args': "['myex', 'equities', 20170103]", 'time_start': 1637348.143546186, 'name': 'parsing.2pass', 'acknowledged': False, 'delivery_info': {'exchange': 'celery', 'priority': 0, 'redelivered': None, 'routing_key': 'celery'}, 'worker_pid': 1550, 'kwargs': '{}'}
-> [email protected]: OK
    * {'hostname': '[email protected]', 'id': '5bf971b7-c2d2-47a1-9e3e-abec6c3c7ab4', 'args': "['myex', 'equities', 20170103]", 'time_start': 1626395.204211438, 'name': 'parsing.2pass', 'acknowledged': False, 'delivery_info': {'exchange': 'celery', 'priority': 0, 'redelivered': None, 'routing_key': 'celery'}, 'worker_pid': 26978, 'kwargs': '{}'}
-> [email protected]: OK
    * {'hostname': '[email protected]', 'id': '5bf971b7-c2d2-47a1-9e3e-abec6c3c7ab4', 'args': "['myex', 'equities', 20170103]", 'time_start': 1630146.08942695, 'name': 'parsing.2pass', 'acknowledged': False, 'delivery_info': {'exchange': 'celery', 'priority': 0, 'redelivered': None, 'routing_key': 'celery'}, 'worker_pid': 19473, 'kwargs': '{}'}
...

Notice that the task 5bf971b7-c2d2-47a1-9e3e-abec6c3c7ab4 is running on at least 3 workers, even though it was triggered by a single send_task() call. We use Redis as the broker with all the defaults (no fancy exchanges or routes).
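
The listing above can also be reproduced programmatically through the inspect API (app name and broker URL are illustrative):

    from celery import Celery

    app = Celery('proj', broker='redis://localhost:6379/0')

    # Ask every worker which tasks it is currently executing.
    active = app.control.inspect().active()
    for worker, tasks in (active or {}).items():
        for task in tasks:
            print(worker, task['id'], task['name'], task['acknowledged'])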

Asked Nov 09 '22 by DejanLekic

1 Answer

There are several possible reasons for this behaviour.

  • Maybe you started Celery together with the celerybeat service on more than one machine or process. There must be only one beat process in the whole setup; otherwise every beat process will schedule the same task.
  • Maybe you should adjust your broker settings. With Redis as the broker, a message that is not acknowledged within the visibility timeout is redelivered to another worker (see the Redis transport caveats in the Celery documentation). Even if you're not using ETA/countdown tasks, this can be the reason for the duplication; a settings sketch follows this list.
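
If Redis redelivery is the culprit, one mitigation is to raise the visibility timeout well above the longest expected task runtime. A sketch using Celery 3.1-style settings (the URL and timeout value are example values):

    # celeryconfig.py -- Celery 3.1-style settings (illustrative values)
    BROKER_URL = 'redis://localhost:6379/0'

    # A message that is not acknowledged within this many seconds is
    # redelivered to another worker, so keep it longer than the
    # longest-running task (here: 12 hours).
    BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 43200}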

Anyway, you can prevent this on your side by using celery_once. The main idea is to check, before a task is queued or executed, whether the same task is already pending or running, and skip it if so. Yes, this looks like a workaround, but it works pretty well.
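
A minimal sketch of that approach, assuming the QueueOnce base class and ONCE settings described in the celery_once documentation (the broker URL, app name, and task body are illustrative):

    from celery import Celery
    from celery_once import QueueOnce

    app = Celery('proj', broker='redis://localhost:6379/0')

    # celery_once keeps a lock per task + arguments in Redis for up to
    # default_timeout seconds, so an identical duplicate is rejected.
    app.conf.ONCE = {
        'backend': 'celery_once.backends.Redis',
        'settings': {
            'url': 'redis://localhost:6379/0',
            'default_timeout': 60 * 60,
        },
    }

    @app.task(base=QueueOnce)
    def two_pass(exchange, asset_class, date):
        # ... the actual parsing work would go here ...
        return (exchange, asset_class, date)

According to the celery_once documentation, a second apply_async for the same task and arguments then raises AlreadyQueued until the first run releases the lock or the lock times out.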

Answered Jan 04 '23 by Nikolay Osaulenko