I have the following set-up:
    CELERYD_OPTS="--time-limit=600 -c:low_p 100 -c:high_p 50 -Q:low_p low_priority_queue_name -Q:high_p high_priority_queue_name"
My problem is, sometimes the queue seems to "back up"... that is, it will stop consuming tasks. There seem to be two scenarios for this:

- celery inspect active will show that not all workers are used up - that is, I will only see a few active tasks
- strace on the worker processes returns nothing... completely zero activity from the worker

I would appreciate any information or pointers on:

- How to debug this: I have tried strace to see what the worker processes are doing, but so far that has only been useful in telling me that the worker is hanging
- How to monitor this: I have looked at flower and events, but while they are both excellent in real time, they don't have any automated monitoring/alarming functionality. Am I just better off writing my own monitoring tools with supervisord?

Also, I am starting my tasks from django-celery.
A very basic queue watchdog can be implemented with just a single script that’s run every minute by cron. First, it fires off a task that, when executed (in a worker), touches a predefined file, for example:
    with open('/var/run/celery-heartbeat', 'w'):
        pass
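For concreteness, a minimal self-contained version of that task might look like the sketch below; the module name, app name, and broker URL are just placeholders to adapt:

    # tasks.py - minimal heartbeat task; app name and broker URL are placeholders
    from celery import Celery

    app = Celery('watchdog', broker='redis://localhost:6379/0')

    HEARTBEAT_FILE = '/var/run/celery-heartbeat'

    @app.task
    def heartbeat():
        # A worker touching this file proves tasks are being consumed;
        # the file's mtime records when that last happened.
        with open(HEARTBEAT_FILE, 'w'):
            pass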
Then the script checks the modification timestamp on that file and, if it's more than a minute (or 2 minutes, or whatever) old, sends an alarm and/or restarts the workers and/or the broker.
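Here is a sketch of that cron-side check, building on the heartbeat task above; the two-minute threshold and the supervisord program name are assumptions to adjust for your setup:

    #!/usr/bin/env python
    # watchdog.py - run from cron every minute: enqueue a fresh heartbeat,
    # then check how stale the heartbeat file left by the previous runs is.
    import os
    import subprocess
    import time

    from tasks import heartbeat, HEARTBEAT_FILE

    MAX_AGE_SECONDS = 120  # alarm if no worker has touched the file recently

    def main():
        # Fire off a heartbeat; a healthy worker will touch the file shortly.
        heartbeat.delay()

        try:
            age = time.time() - os.path.getmtime(HEARTBEAT_FILE)
        except OSError:
            age = float('inf')  # file missing: treat as never touched

        if age > MAX_AGE_SECONDS:
            # Alarm and/or restart here; 'celery' is assumed to be the
            # supervisord program name for the workers.
            subprocess.call(['supervisorctl', 'restart', 'celery'])

    if __name__ == '__main__':
        main()

A crontab entry like "* * * * * /path/to/watchdog.py" then runs the check every minute.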
It gets a bit trickier if you have multiple machines, but the same idea applies.