
Which pool class should I use in Celery: prefork, eventlet, or gevent?

I have 3 remote workers, each running with the default pool (prefork) and a single task.

A single task takes 2 to 5 minutes to complete, because it runs many different tools and inserts data into ELK.

Worker command: celery -A project worker -l info

Which pool class should I use to make processing faster?

Is there any other way to improve performance?

Md Sadique asked Mar 22 '17 10:03


People also ask

What is Prefork in celery?

The prefork pool implementation is based on Python's multiprocessing package. It allows your Celery worker to side-step Python's Global Interpreter Lock and fully leverage multiple processors on a given machine. You want to use the prefork pool if your tasks are CPU bound.

What is Eventlet in celery?

Eventlet is a concurrent networking library for Python that allows you to change how you run your code, not how you write it. It uses epoll or kqueue or libevent for highly scalable non-blocking I/O.

Does celery use multiprocessing?

Celery itself uses billiard (a fork of multiprocessing) to run your tasks in separate processes.

What are workers in celery?

When you run a Celery worker, it creates one parent process to manage the running tasks. This process handles bookkeeping such as sending and receiving queue messages, registering tasks, killing hung tasks, and tracking status.


1 Answer

Funny that this question scrolled by.

We just switched from eventlet to gevent. Eventlet caused hanging broker connections which ultimately stalled the workers.

General tips:

  • Use a higher concurrency if you're I/O bound. I would start with 25, check the CPU load, and tweak from there; aim for 99.9% CPU usage for the process.
  • You might want to use --without-gossip and --without-mingle if your workforce grows.
  • Don't use RabbitMQ as your result backend (redis ftw!), but RabbitMQ is our first choice when it comes to a broker (the AMQP emulation on Redis and Celery's hacky async-redis solution are smelly and caused us a lot of grief in the past).
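
To make the first two tips concrete, here is a rough sketch that assembles a worker command line from them. The helper name build_worker_command and its defaults are made up for illustration; the app name "project" comes from the question, 25 is only the suggested starting point, and the gevent pool assumes the gevent package is installed on the worker.

```python
import shlex


def build_worker_command(app="project", pool="gevent", concurrency=25,
                         loglevel="info", quiet_cluster=True):
    """Return the argv list for a Celery worker (hypothetical helper)."""
    argv = [
        "celery", "-A", app, "worker",
        "-l", loglevel,
        "-P", pool,              # pool implementation: prefork, eventlet or gevent
        "-c", str(concurrency),  # greenlets for gevent/eventlet, processes for prefork
    ]
    if quiet_cluster:
        # Turn off gossip (runtime worker-to-worker events) and
        # mingle (startup synchronisation with other workers).
        argv += ["--without-gossip", "--without-mingle"]
    return argv


if __name__ == "__main__":
    # Print the command so it can be pasted into a shell or a service file.
    print(shlex.join(build_worker_command()))
```

Start from the printed command, watch the CPU load of the worker process, and raise or lower the -c value from there.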

More advanced options to tune your Celery workers:

  • Pin each worker process to one core to avoid the overhead of moving processes around (taskset is your friend).
  • If one worker isn't always working, consider core-sharing with one or two other processes; use nice if one process has priority.
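
One way the pinning could look, as a sketch only: the helper below builds one taskset-wrapped command per core, optionally wrapped in nice for workers that share a core with something more important. The function name pinned_worker_commands, the core numbers, the node names, and the gevent pool with 25 greenlets are all assumptions for illustration, not something your setup requires.

```python
import shlex


def pinned_worker_commands(cores, app="project", niceness=None):
    """Build one taskset-wrapped worker command per CPU core (hypothetical helper)."""
    commands = []
    for core in cores:
        argv = ["taskset", "-c", str(core)]        # restrict the worker to this core
        if niceness is not None:
            argv += ["nice", "-n", str(niceness)]  # lower priority on a shared core
        argv += [
            "celery", "-A", app, "worker", "-l", "info",
            "-P", "gevent", "-c", "25",
            "-n", f"worker{core}@%h",              # unique node name per worker
        ]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # One worker on core 0 and one on core 1.
    for argv in pinned_worker_commands(cores=[0, 1]):
        print(shlex.join(argv))
```

A gevent or eventlet worker is a single process, so one core per worker is a natural fit; with prefork the child processes inherit the affinity of the command that started them.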

ACimander answered Oct 14 '22 23:10