RabbitMQ: What Does Celery Offer That Pika Doesn't?

I've been working on getting some distributed tasks working via RabbitMQ.

I spent some time trying to get Celery to do what I wanted and couldn't make it work.

Then I tried using Pika and things just worked, flawlessly, and within minutes.

Is there anything I'm missing out on by using Pika instead of Celery?

asked May 20 '14 by Jason Champion


2 Answers

What pika provides is just a small piece of what Celery is doing. Pika is a Python library for interacting with RabbitMQ. RabbitMQ is a message broker; at its core, it just sends messages to and receives messages from queues. It can be used as a task queue, but it could also just be used to pass messages between processes, without actually distributing "work".

Celery implements a distributed task queue, optionally using RabbitMQ as a broker for IPC. Rather than just providing a way of sending messages between processes, it provides a system for distributing actual tasks/jobs between processes. Here's how Celery's site describes it:

Task queues are used as a mechanism to distribute work across threads or machines.

A task queue’s input is a unit of work, called a task, dedicated worker processes then constantly monitor the queue for new work to perform.

Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task a client puts a message on the queue, the broker then delivers the message to a worker.

A Celery system can consist of multiple workers and brokers, giving way to high availability and horizontal scaling.

Celery has a whole bunch of functionality built-in that is outside of pika's scope. You can take a look at the Celery docs to get an idea of the sort of things it can do, but here's an example:

>>> from proj.tasks import add
>>> res = add.chunks(zip(range(100), range(100)), 10)()
>>> res.get()
[[0, 2, 4, 6, 8, 10, 12, 14, 16, 18],
 [20, 22, 24, 26, 28, 30, 32, 34, 36, 38],
 [40, 42, 44, 46, 48, 50, 52, 54, 56, 58],
 [60, 62, 64, 66, 68, 70, 72, 74, 76, 78],
 [80, 82, 84, 86, 88, 90, 92, 94, 96, 98],
 [100, 102, 104, 106, 108, 110, 112, 114, 116, 118],
 [120, 122, 124, 126, 128, 130, 132, 134, 136, 138],
 [140, 142, 144, 146, 148, 150, 152, 154, 156, 158],
 [160, 162, 164, 166, 168, 170, 172, 174, 176, 178],
 [180, 182, 184, 186, 188, 190, 192, 194, 196, 198]]

This code adds x+x for every x in range(100). It does this by taking a task called add, which adds two numbers, and distributing the work of adding 0+0, 1+1, 2+2, etc. into chunks of 10, spreading those chunks across as many Celery workers as are available. Each worker runs add on its 10-item chunk until all the work is complete; the results are then gathered up by the res.get() call. I'm sure you can imagine a way to do this using pika, but I'm sure you can also imagine how much work would be required. You're getting that functionality out of the box with Celery.
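For reference, here is what that chunked computation works out to in plain Python, without any of the distribution Celery adds on top:

```python
def add(x, y):
    return x + y

# zip(range(100), range(100)) yields the pairs (0, 0), (1, 1), ..., (99, 99).
pairs = list(zip(range(100), range(100)))

# Split the 100 pairs into ten chunks of ten, then apply add to each pair,
# mirroring what each Celery worker does with the chunk it receives.
chunks = [pairs[i:i + 10] for i in range(0, 100, 10)]
results = [[add(x, y) for x, y in chunk] for chunk in chunks]
# results[0] == [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Celery's value is not this arithmetic, of course, but that the chunks run on separate worker processes, possibly on separate machines, with retries and result collection handled for you.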

You can certainly use pika to implement a distributed task queue if you want, especially if you have a fairly simple use-case. Celery is just providing a "batteries included" solution for task scheduling, management, etc. that you'll have to manually implement if you decide you want them with your pika solution.

answered Sep 19 '22 by dano


I'm going to add an answer here, because this is the second time today someone has recommended Celery when it wasn't needed, based on this answer I suspect.

The difference between a distributed task queue and a broker is that a broker just passes messages. Nothing more, nothing less. Celery recommends RabbitMQ as the default broker for IPC and layers adapters on top of it to manage tasks/queues with daemon processes. While this is useful, especially for distributed tasks where you need something generic very quickly, it's just a construct around the publisher/consumer process. For actual tasks, where you have a defined workflow you need to step through and must ensure message durability based on your specific needs, you'd be better off writing your own publisher/consumer than relying on Celery. Obviously, you still have to do all of the durability checking, etc., yourself.

With most web-related services you don't control the actual "work" units; rather, you pass them off to a service. Thus a distributed task queue makes little sense unless you're hitting some arbitrary API call limit based on IP/geographical region or account number, or something along those lines. So using Celery doesn't stop you from having to write or deal with state code or workflow management; it exposes AMQP in a way that spares you writing the publisher/consumer constructs yourself.

So, in short: if you need a simple task queue to chew through work, and you aren't really concerned about the nuances of performance, the intricacies of durability through your workflow, or the actual publish/consume processes, Celery works. If you are just passing messages to an API or service you don't actually control, sure, you could use Celery, but you could just as easily whip up your own publisher/consumer with Pika in a couple of minutes. If you need something robust, or something that adheres to your own durability scenarios, write your own publish/consumer code like everyone else.

answered Sep 22 '22 by Christopher Warner at mdsol