My platform runs through a lot of tasks (several thousand per day). Some of the longer tasks keep failing with the following error:
Traceback (most recent call last):
  File "/app/.heroku/python/lib/python2.7/site-packages/billiard/pool.py", line 1167, in mark_as_worker_lost
    human_status(exitcode)),
WorkerLostError: Worker exited prematurely: exitcode 0.
According to Celery's Flower, which shows nothing beyond the traceback above, the task was received (2014-12-22 22:46:46.196814) a little over three minutes before it started (2014-12-22 22:50:03.469647), and failed just ten seconds later (epoch 1419288613.34, i.e. 2014-12-22 22:50:13).
This has been a recurring problem on my platform. It happens mostly with tasks that run Scrapy 0.24.2, but it can also happen with other tasks.
Other WorkerLostError failures (also with an exit code of zero) occur after three, five, or seven minutes.
Any thoughts on what could be causing this? All tasks run perfectly fine locally. Thanks.
My recommendation is to check your code, and all of the modules you are using, for 'raise BaseException'. I ran into the same WorkerLostError with exitcode 0.
After a lot of debugging to pinpoint exactly where tasks were failing, I found it happened whenever BaseException was raised: instead of the actual error message, a WorkerLostError was reported.
After changing to 'raise Exception', the real error message was reported whenever something went wrong inside the task. This might not be the same for your case, but it was what I found when dealing with the same error.
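To illustrate the pattern, here is a minimal sketch (the app name, broker URL, task names, and messages are all hypothetical). The idea, consistent with the behavior described above, is that Celery's task wrapper catches Exception subclasses but not a bare BaseException, which escapes and takes the worker process down:

from celery import Celery

# Hypothetical app and broker URL, for illustration only.
app = Celery('demo', broker='redis://localhost:6379/0')

@app.task
def bad_task():
    # A bare BaseException escapes Celery's exception handling; the worker
    # process dies and the pool reports WorkerLostError instead of this message.
    raise BaseException('this message never reaches the result backend')

@app.task
def good_task():
    # Exception (or any subclass) is caught by Celery, so the task is
    # marked as FAILURE and the real traceback is visible in Flower.
    raise Exception('this message is recorded as the task failure')

Running bad_task.delay() under a prefork worker should reproduce the WorkerLostError, while good_task.delay() fails with the expected traceback.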