I have been happily running celery+rabbitmq+django in production for a month or so. Yesterday I decided to upgrade from celery 2.1.4 to 2.2.4, and now rabbitmq is spinning out of control. After running for a while, my nodes are no longer recognized by evcam, and beam.smp's memory consumption starts increasing slowly while it sits at 100+% CPU usage.
I can run rabbitmqctl list_connections and see that there is nothing unusual (just my one test node). I can see in rabbitmqctl list_queues -p <VHOST> that there are no messages except the heartbeat from my test node. If I let the process keep running over a couple of hours it maxes out the machine.
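For anyone who wants to repeat the queue check above without eyeballing the output, here is a rough Python sketch. It assumes rabbitmqctl is on the PATH and prints one queue per line followed by its message count; the helper names parse_list_queues and nonempty_queues are made up for illustration and are not part of celery or rabbitmq.

```python
import subprocess


def parse_list_queues(output):
    """Parse `rabbitmqctl list_queues name messages` output into {queue: count}."""
    counts = {}
    for line in output.splitlines():
        parts = line.rsplit(None, 1)  # split off the trailing message count
        # Skip banner lines such as "Listing queues ..." and "...done."
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        counts[parts[0].strip()] = int(parts[1])
    return counts


def nonempty_queues(vhost):
    """Run rabbitmqctl for the given vhost and return queues that hold messages."""
    output = subprocess.check_output(
        ["rabbitmqctl", "list_queues", "-p", vhost, "name", "messages"]
    ).decode()
    return {name: n for name, n in parse_list_queues(output).items() if n > 0}
```

Calling nonempty_queues with the vhost name every few minutes would show whether any queue is actually growing while beam.smp's memory climbs.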
I've tried purging the various queues using camqadm to no avail and stop_app just hangs. The only way that I have found to 'fix' it is to kill -9 beam.smp (and all related processes) and force_reset on my rabbitmq server.
I have no idea how to go about debugging this. There doesn't appear to be anything fishy going on as far as new messages etc. Has anybody run up against this before? Any ideas? What other information should I be looking at?
The celery developer told me 3 months ago that the versions of RabbitMQ after 2.1.1 were affected by a memory leak, with CPU peaks. I'm still using version 2.1.1 and I don't have this problem.
http://www.rabbitmq.com/releases/rabbitmq-server/v2.1.1/
It is also true that celery 2.2.4 introduced some memory problems, but if you update to celery 2.2.5 most of them are solved.
http://docs.celeryproject.org/en/v2.2.5/changelog.html#fixes
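If it helps, a small guard like the following could warn you when an environment is still on a celery older than 2.2.5. This is only a sketch: version_tuple, needs_upgrade and MIN_CELERY are names I made up, and the comparison only looks at the leading numeric parts of the version string.

```python
from itertools import takewhile

# Assumed minimum, taken from the 2.2.5 changelog linked above.
MIN_CELERY = (2, 2, 5)


def version_tuple(version):
    """Turn a string like '2.2.4' or '2.2.5rc1' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits of each dotted piece.
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def needs_upgrade(installed, minimum=MIN_CELERY):
    """Return True when the installed version is older than the minimum."""
    return version_tuple(installed) < minimum
```

In a project with celery installed you would call it as needs_upgrade(celery.__version__) after import celery.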
I hope this helps.