I am trying to get RabbitMQ with Celery and Django going on an EC2 instance to do some pretty basic background processing. I'm running rabbitmq-server 2.5.0 on a large EC2 instance.
I downloaded and installed the test client per the instructions here (at the very bottom of the page). I have been just letting the test script go and am getting the expected output:
recving rate: 2350 msg/s, min/avg/max latency: 588078478/588352905/588588968 microseconds
recving rate: 1844 msg/s, min/avg/max latency: 588589350/588845737/589195341 microseconds
recving rate: 1562 msg/s, min/avg/max latency: 589182735/589571192/589959071 microseconds
recving rate: 2080 msg/s, min/avg/max latency: 589959557/590284302/590679611 microseconds
The problem is that it is consuming an incredible amount of CPU:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
668 rabbitmq 20 0 618m 506m 2340 S 166 6.8 2:31.53 beam.smp
1301 ubuntu 20 0 2142m 90m 9128 S 17 1.2 0:24.75 java
I was testing on a micro instance earlier and it was completely consuming all resources on the instance.
Is this to be expected? Am I doing something wrong?
Thanks.
Edit:
The real reason for this post was that celerybeat seemed to run okay for awhile and then suddenly consume all resources on the system. I installed the rabbitmq management tools and have been investigating how the queues are created from celery and from the rabbitmq test suite. It seems to me that celery is orphaning these queues and they are not going away.
Here is the queue as generated by the test suite. One queue is created and all the messages go into it and come out:
Celerybeat creates a new queue for every time it runs the task:
It sets the auto-delete parameter to true, but I'm not entirely sure when these queues will get deleted. They seem to just slowly build up and eat resources.
Does anyone have an idea?
Thanks.
Ok, I figured it out.
Here's the relevant piece of documentation: http://readthedocs.org/docs/celery/latest/userguide/tasks.html#amqp-result-backend
Old results will not be cleaned automatically, so you must make sure to consume the results or else the number of queues will eventually go out of control. If you’re running RabbitMQ 2.1.1 or higher you can take advantage of the x-expires argument to queues, which will expire queues after a certain time limit after they are unused. The queue expiry can be set (in seconds) by the CELERY_AMQP_TASK_RESULT_EXPIRES setting (not enabled by default).
To add to Eric Conner's solution to his own problem, http://docs.celeryproject.org/en/latest/userguide/tasks.html#tips-and-best-practices states:
Ignore results you don’t want
If you don’t care about the results of a task, be sure to set the ignore_result option, as storing results wastes time and resources.
@app.task(ignore_result=True) def mytask(…): something()
Results can even be disabled globally using the CELERY_IGNORE_RESULT setting.
That along with Eric's answer is probably a bare minimum best practices for managing your results backend.
If you don't need a results backend, set CELERY_IGNORE_RESULT or don't set a results backend at all. If you do need a results backend, set CELERY_AMQP_TASK_RESULT_EXPIRES to be safeguarded against unused results building up. If you don't need it for a specific app, set the local ignore as above.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With