How do I troubleshoot an exit timeout of celeryd when running on Heroku (error R12)?

I'm running celeryd on a Heroku dyno. When I shut the dyno down after it has processed at least one task (even one that completed successfully), celeryd doesn't shut down properly and Heroku reports error R12 (exit timeout).

Here's how I'm running celeryd from my Procfile (through Django and django-celery):

celeryd: python manage.py celeryd -E --loglevel=INFO

Here's what I'm doing to trigger it:

> heroku ps:scale web=0 celeryd=0 --app myapp

And here's the log output I'm getting:

2012-09-07T12:56:31+00:00 heroku[celeryd.1]: State changed from up to down
2012-09-07T12:56:31+00:00 heroku[api]: Scale to celeryd=0, web=1 by [email protected]
2012-09-07T12:56:32+00:00 heroku[web.1]: State changed from up to down
2012-09-07T12:56:32+00:00 heroku[api]: Scale to web=0 by [email protected]
2012-09-07T12:56:34+00:00 heroku[celeryd.1]: Stopping all processes with SIGTERM
2012-09-07T12:56:35+00:00 heroku[web.1]: Stopping all processes with SIGTERM
2012-09-07T12:56:37+00:00 heroku[web.1]: Process exited with status 143
2012-09-07T12:56:43+00:00 heroku[celeryd.1]: Error R12 (Exit timeout) -> At least one process failed to exit within 10 seconds of SIGTERM
2012-09-07T12:56:43+00:00 heroku[celeryd.1]: Stopping remaining processes with SIGKILL
2012-09-07T12:56:45+00:00 heroku[celeryd.1]: Process exited with status 137

I originally experienced this on celery 2.5.5. I have since upgraded to 3.0.9 and still have the same problem.

As far as I can tell, my tasks have all completed. This error is reliably reproducible by running a single task on that celery dyno, giving it enough time to complete and then shutting the dyno down.
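
In case it helps, here's the equivalent sequence outside Heroku (an untested sketch assuming a Unix shell; Heroku sends SIGTERM first, then SIGKILL after 10 seconds):

python manage.py celeryd -E --loglevel=INFO &
# dispatch one task and let it finish, then emulate Heroku's shutdown:
kill -TERM %1             # what Heroku sends first
sleep 10; kill -KILL %1   # what Heroku does if the worker hasn't exited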

I don't know what else to check. Any idea how I can troubleshoot this? What could block celeryd from responding to Heroku's SIGTERM when the task has already completed?
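
Update: one thing I plan to try is hooking Celery's worker signals to see how far the shutdown actually gets, and whether a stray non-daemon thread is keeping the process alive. A minimal sketch (the module and logger names are made up; it assumes this module gets imported when the worker starts, e.g. from the tasks module):

import logging
import threading

from celery.signals import task_postrun, worker_shutdown

logger = logging.getLogger("shutdown_debug")

@task_postrun.connect
def log_task_done(task_id=None, state=None, **kwargs):
    # confirms the task really finished before the dyno was scaled down
    logger.info("task %s finished (state=%s)", task_id, state)

@worker_shutdown.connect
def dump_threads(**kwargs):
    # fires when the worker begins its shutdown sequence; any non-daemon
    # thread still listed here is a candidate for blocking the exit
    for t in threading.enumerate():
        logger.warning("live thread at shutdown: %s (daemon=%s)", t.name, t.daemon)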

asked Sep 07 '12 by Henrik Heimbuerger


1 Answer

I'm encountering the same issue. I'm not sure, but it may have been fixed upstream by this change:

Worker with -B argument did not properly shut down the beat instance.

So if you're running celery beat inside a worker instance (the -B argument), you might need to upgrade.
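
If upgrading alone doesn't help, a common workaround is to run beat as its own process instead of embedding it in the worker with -B. A sketch of what that could look like in the Procfile (assuming django-celery's celerybeat management command; adjust for your app):

celeryd: python manage.py celeryd -E --loglevel=INFO
celerybeat: python manage.py celerybeat --loglevel=INFO

That way the worker's response to SIGTERM no longer depends on the embedded beat instance shutting down first.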

answered Oct 02 '22 by Scott Coates