 

Celery - minimize memory consumption

We have ~300 celeryd processes running under Ubuntu 10.04 64-bit. When idle, every process takes ~19 MB RES and ~174 MB VIRT, so all processes together use around 6 GB of RAM while idle. In the active state a process takes up to 100 MB RES and ~300 MB VIRT.

Every process uses minidom (the XML files are < 500 KB with a simple structure) and urllib.

The question is: how can we decrease RAM consumption, at least for idle workers? Are there celery or Python options that might help? And how do we determine which part takes most of the memory?

UPD: These are flight search agents, one worker per agency/date. We have 10 agencies, and one user search covers 9 dates, so we have 10*9 = 90 agents per user search.

Is it possible to start celeryd processes on demand to avoid idle workers (something like MaxSpareServers in Apache)?

UPD2: The agent lifecycle is: send an HTTP request, wait ~10-20 s for the response, parse the XML (takes less than 0.02 s), save the result to MySQL.
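
For reference, a minimal sketch of that lifecycle as a Celery 2.x task; the task name, URL parameters, and MySQL schema are hypothetical stand-ins for the real agents:

    import urllib
    from xml.dom import minidom

    import MySQLdb
    from celery.task import task

    @task
    def search_agency(agency_url, date):
        # Send the HTTP request; the response takes ~10-20 s to arrive.
        xml_data = urllib.urlopen("%s?date=%s" % (agency_url, date)).read()

        # Parse the small, simple XML document (< 0.02 s).
        doc = minidom.parseString(xml_data)
        fares = [n.firstChild.data for n in doc.getElementsByTagName("fare")]

        # Save the result to MySQL.
        conn = MySQLdb.connect(db="flights")
        cur = conn.cursor()
        cur.executemany("INSERT INTO results (fare) VALUES (%s)",
                        [(f,) for f in fares])
        conn.commit()
        conn.close()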

Andrew asked Dec 03 '10


3 Answers

Read this:

http://docs.celeryproject.org/en/latest/userguide/workers.html#concurrency

It sounds like you have one worker process per celeryd instance. That seems wrong: you should have dozens of workers per celeryd. Keep raising the number of workers (and lowering the number of celeryd instances) until your system is very busy and very slow.
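
The pool size can be set per celeryd; a sketch of the 2.x setting (the exact number is something to tune for your load):

    # celeryconfig.py -- run a few celeryd instances, each with a
    # pool of many workers, instead of ~300 single-worker processes.
    CELERYD_CONCURRENCY = 25   # same as: celeryd --concurrency=25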

S.Lott answered Nov 01 '22


S. Lott is right. The main instance consumes messages and delegates them to worker pool processes. There is probably no point in running 300 pool processes on a single machine! Try 4 or 5 times the number of CPU cores. You may gain something by running more than one celeryd with a few processes each, as some people have, but you would have to experiment for your application.

See http://celeryq.org/docs/userguide/workers.html#concurrency

For the upcoming 2.2 release we're working on Eventlet pool support. This may be a good alternative for IO-bound tasks, enabling you to run 1000+ threads with minimal memory overhead, but it's still experimental and bugs are being fixed for the final release.

See http://groups.google.com/group/celery-users/browse_thread/thread/94fbeccd790e6c04
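
A sketch of selecting the experimental pool, assuming the 2.2 setting names:

    # celeryconfig.py -- Eventlet pool: green threads instead of OS
    # processes, so 1000+ concurrent IO-bound tasks cost little memory.
    CELERYD_POOL = "eventlet"     # requires the eventlet package
    CELERYD_CONCURRENCY = 1000    # green threads, not processes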

The upcoming 2.2 release also has support for autoscale, which adds/removes processes on demand. See the Changelog: http://ask.github.com/celery/changelog.html#version-2-2-0 (this changelog is not completely written yet).
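
Autoscaling is enabled from the command line; a sketch, assuming the 2.2 flag syntax:

    celeryd --autoscale=10,3
    # grow the pool to at most 10 processes under load, shrink back
    # to a minimum of 3 when idle -- similar in spirit to Apache's
    # MaxSpareServers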

asksol answered Nov 01 '22


The natural number of workers is close to the number of cores you have. The workers are there so that cpu-intensive tasks can use an entire core efficiently. The broker is there so that requests that don't have a worker on hand to process them are kept queued. The number of queues can be high, but that doesn't mean you need a high number of brokers either. A single broker should suffice, or you could shard queues to one broker per machine if it later turns out fast worker-queue interaction is beneficial.
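
To make the many-queues-one-broker point concrete, a sketch using Celery's router API, with hypothetical task and queue names:

    # celeryconfig.py -- many queues on a single broker, one queue
    # per agency, without extra broker instances.
    class AgencyRouter(object):
        def route_for_task(self, task, args=None, kwargs=None):
            if task == "tasks.search_agency":
                # assume args[0] carries the agency name
                return {"queue": "agency.%s" % args[0]}
            return None

    CELERY_ROUTES = (AgencyRouter(),)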

Your problem seems unrelated to that. I'm guessing that your agencies don't provide a message queue API, and you have to keep many requests in flight at once. If so, you need a few (emphasis on few) event-driven processes, for example based on Twisted or node.js.
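
A minimal sketch of that idea with eventlet (Twisted would work equally well); the agency URL is hypothetical:

    import eventlet
    from eventlet.green import urllib2  # green sockets yield while waiting

    urls = ["http://agency.example.com/search?date=%d" % d
            for d in range(9)]

    pool = eventlet.GreenPool(100)  # 100 in-flight requests, one process

    def fetch(url):
        return url, urllib2.urlopen(url).read()

    for url, body in pool.imap(fetch, urls):
        print url, len(body)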

Tobu answered Nov 01 '22