
Celery tasks with a long eta (8+ hours) are executed multiple times in a row when the eta is reached

I'm creating tasks with an eta between 3 and 20 hours. When I look at the worker log for such a task, the worker says "Got task from broker: ..." every hour after the task was first received, until the eta is reached.

I know that this has to do with setting BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': X} where X is the number in seconds.

So I experimented with visibility_timeout. If I set it to anything less than 1 hour, I can see the worker getting the same task every X seconds. However, when I set X to more than 1 hour, the redelivery interval stays at 1 hour regardless of the value I set.

Has anyone else run into this issue? Is this a known bug?

I'm using Celery 3.0.11 (Chiastic Slide) with Redis server version 2.4.15

user1713317 asked Oct 02 '12 01:10



1 Answer

EDIT: Any message consumer using kombu* that is connected to the same Redis URL will help restore unacked messages, so you have to make sure all of them are configured with the same visibility_timeout value.
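
To make the consequence concrete, here is a minimal sketch of why one unconfigured consumer is enough to bring the 1 hour interval back. It is a simplified model, not kombu's code: the function name is hypothetical, and it assumes that whichever consumer has the shortest timeout restores the message first.

```python
# Simplified model of the behaviour described above, not kombu's actual code.
DEFAULT_VISIBILITY_TIMEOUT = 3600  # seconds; the Redis transport default (1 hour)


def effective_visibility_timeout(consumer_timeouts):
    """Return the timeout (in seconds) that governs redelivery in this model.

    consumer_timeouts holds one entry per consumer on the same Redis URL;
    None means that consumer was started without the setting.
    """
    resolved = [
        DEFAULT_VISIBILITY_TIMEOUT if timeout is None else timeout
        for timeout in consumer_timeouts
    ]
    # Assumption: the consumer with the shortest timeout restores first.
    return min(resolved)


worker_timeout = 12 * 3600    # a worker configured for 12 hours
unconfigured_consumer = None  # e.g. a monitor started without the config

mixed = effective_visibility_timeout([worker_timeout, unconfigured_consumer])
consistent = effective_visibility_timeout([worker_timeout, worker_timeout])
print(mixed, consistent)
```

In this model the mixed setup behaves as if the timeout were still 1 hour, which matches the symptom in the question.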

A common mistake is starting the Flower monitor like this:

celery flower -b redis://somewhere

instead of like this:

celery -A proj flower

The former means the Flower instance is not loaded with the Celery configuration, so it is missing BROKER_TRANSPORT_OPTIONS and the visibility_timeout setting and falls back to the default visibility timeout of 1 hour.
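
One way to keep every consumer on the same value is to define the setting once in a configuration module that the app loads. The sketch below is only an illustration: the module name celeryconfig.py, the Redis URL, the 20 hour bound (taken from the question) and the 1 hour margin are all assumptions.

```python
# celeryconfig.py (hypothetical module name)
HOUR = 3600

LONGEST_ETA_HOURS = 20  # assumption: upper bound of the etas in the question
MARGIN_HOURS = 1        # assumption: arbitrary safety margin

BROKER_URL = "redis://localhost:6379/0"  # placeholder URL

# The timeout should be longer than the longest eta you plan to use,
# otherwise the task is redelivered before it is due.
BROKER_TRANSPORT_OPTIONS = {
    "visibility_timeout": (LONGEST_ETA_HOURS + MARGIN_HOURS) * HOUR,
}
```

If the app in proj loads this module (for example with app.config_from_object), the worker and a Flower instance started with celery -A proj flower should both pick up the same visibility_timeout.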

In addition, you have to make sure that the wall clocks of all machines are kept in sync using ntp, as described in the original reply below.

* kombu is the messaging library used by Celery.

Original reply:

I haven't heard of anything like this, but it could be a bug. I added some print statements to kombu/transport/redis.py to check whether the visibility_timeout was set correctly, and it definitely is for me. Testing that it works with values greater than an hour will take more time though (about 2 hours to be exact), so I can report back then.

In the meantime you could verify that you are setting the visibility_timeout correctly by adding a print statement yourself (e.g. to the restore_visible method in the Redis transport).

Note that this feature relies on timestamps, so if you have more than one machine it is important that their clocks are closely in sync (and especially not drifting apart by hours). You should always use ntp on networked servers and sync regularly.
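
To see why clock drift matters, here is a small sketch of a timestamp-based check. It is a simplified model under stated assumptions, not the transport's real implementation: the function name is hypothetical, and it assumes a message counts as restorable once the checking machine's clock shows that visibility_timeout seconds have passed since the delivery timestamp.

```python
HOUR = 3600


def is_restorable(delivered_at, now, visibility_timeout):
    """Return True when an unacked message looks old enough to redeliver.

    delivered_at and now are timestamps in seconds, each read from the
    wall clock of whichever machine produced them.
    """
    return now - delivered_at >= visibility_timeout


visibility_timeout = 12 * HOUR
delivered_at = 0      # stamped by the worker that received the task
true_now = 11 * HOUR  # 11 hours have really elapsed

# A consumer whose clock agrees with the worker's leaves the message alone.
in_sync = is_restorable(delivered_at, true_now, visibility_timeout)

# A consumer whose clock runs 2 hours ahead believes 13 hours have passed.
skewed = is_restorable(delivered_at, true_now + 2 * HOUR, visibility_timeout)

print(in_sync, skewed)
```

In this model the skewed machine restores the message an hour too early, even though every consumer uses the same visibility_timeout.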

asksol answered Sep 30 '22 16:09
