
Is there a performance difference between pooling connections and pooling channels in RabbitMQ?

I'm a newbie with RabbitMQ (and programming), so sorry in advance if this is obvious. I am creating a pool to share between threads that are working on a queue, but I'm not sure whether I should put connections or channels in the pool.

I know I need channels to do the actual work, but is there a performance benefit to having one channel per connection (in terms of more throughput from the queue)? Or am I better off using a single connection per application and pooling many channels?

Note: because I'm pooling the resources, the initial cost is not a factor; I know connections are more expensive to open than channels. I'm more interested in throughput.

Lostsoul asked May 02 '12 04:05




2 Answers

I found this on the RabbitMQ website. It is near the bottom of the page, so I have quoted the relevant part below.

The tl;dr version is that you should have one connection per application and one channel per thread. Hope that helps.

Connections

AMQP connections are typically long-lived. AMQP is an application level protocol that uses TCP for reliable delivery. AMQP connections use authentication and can be protected using TLS (SSL). When an application no longer needs to be connected to an AMQP broker, it should gracefully close the AMQP connection instead of abruptly closing the underlying TCP connection.

Channels

Some applications need multiple connections to an AMQP broker. However, it is undesirable to keep many TCP connections open at the same time because doing so consumes system resources and makes it more difficult to configure firewalls. AMQP 0-9-1 connections are multiplexed with channels that can be thought of as "lightweight connections that share a single TCP connection".

For applications that use multiple threads/processes for processing, it is very common to open a new channel per thread/process and not share channels between them.

Communication on a particular channel is completely separate from communication on another channel, therefore every AMQP method also carries a channel number that clients use to figure out which channel the method is for (and thus, which event handler needs to be invoked, for example).

It is advised to use one channel per thread. Some client libraries document their channels as thread safe, so multiple threads could send through one channel, but others do not, so check your client's documentation. For your application I would suggest sticking with one channel per thread.

Additionally, it is advised to have only one consumer per channel.

These are only guidelines, so you will have to do some testing to see what works best for you.
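As a rough illustration of the "one connection per application, one channel per thread" guideline, here is a minimal Python sketch. It assumes only that the connection object exposes a `channel()` method; `PerThreadChannels`, `FakeConnection`, and `demo` are hypothetical names made up for this example, and `FakeConnection` stands in for a real client so the snippet runs without a broker. Whether a connection may be shared across threads at all depends on the client (pika's `BlockingConnection`, for instance, is documented as not thread safe), so treat this as a pattern rather than a drop-in implementation.

```python
import threading


class PerThreadChannels:
    """Hand each thread its own channel on one shared connection."""

    def __init__(self, connection):
        # `connection` is any object with a .channel() method (an assumption).
        self._connection = connection
        self._local = threading.local()
        self._open_lock = threading.Lock()

    def get(self):
        """Return the calling thread's channel, opening it on first use."""
        channel = getattr(self._local, "channel", None)
        if channel is None:
            # Serialize channel creation in case the client does not allow
            # concurrent calls on the connection object.
            with self._open_lock:
                channel = self._connection.channel()
            self._local.channel = channel
        return channel


class FakeConnection:
    """Stand-in for a real client connection, for demonstration only."""

    def __init__(self):
        self.opened = 0

    def channel(self):
        self.opened += 1
        return object()


def demo(worker_count=4):
    """Start workers that each ask for a channel twice; report what they got."""
    connection = FakeConnection()
    channels = PerThreadChannels(connection)
    seen = []
    seen_lock = threading.Lock()

    def worker():
        first = channels.get()
        second = channels.get()  # same thread, so the same channel comes back
        with seen_lock:
            seen.append((first, second))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return connection.opened, seen
```

Each worker gets a channel of its own and keeps reusing it, so channels stay long-lived and are never shared between threads.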

This thread has some insights here and here.

Despite all these guidelines, this post suggests that having multiple connections will most likely not hurt performance, though it does not say whether it is talking about the client side or the server (RabbitMQ) side. The one caveat is that more connections will of course use more system resources.

If that is not a problem and you want more throughput, it may indeed be better to have multiple connections, as this post suggests that multiple connections allow more throughput. The reason seems to be that even with multiple channels, only one message goes through the connection at a time. A large message will therefore block the whole connection, and many unimportant messages on one channel may block an important message on a different channel of the same connection.

Again, resources are an issue. If you are already using all the bandwidth with one connection, adding another connection will not improve performance over having two channels on the one connection. Each connection also uses more memory, CPU and file handles. That may well not be a concern now, but it might become an issue when scaling.

robthewolf answered Oct 18 '22 20:10



In addition to the accepted answer:

If you have a cluster of RabbitMQ nodes with either a load-balancer in front, or a short-lived DNS (making it possible to connect to a different rabbit node each time), then a single, long-lived connection would mean that one application node works exclusively with a single RabbitMQ node. This may lead to one RabbitMQ node being more heavily utilized than the others.

The other concern mentioned above is that publishing and consuming are blocking operations, which leads to messages queueing up. Having more connections ensures that (1) the processing time for one message doesn't block other messages, and (2) big messages don't block other messages.

That's why it's worth considering a small connection pool (keeping in mind the resource concerns raised above).
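A small connection pool along these lines could look like the following Python sketch. `ConnectionPool` and its `factory` argument are hypothetical names for illustration: `factory` is whatever callable opens one connection with your client library (and, behind a load balancer, may land on a different node each time it is called). A production pool would also need to detect and replace dead connections, which is left out here.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: open `size` connections up front and lend them out."""

    def __init__(self, factory, size=2):
        # `factory` is a caller-supplied callable that opens one connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        """Borrow a connection; it goes back to the pool when the block exits."""
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)
```

Usage is `with pool.connection() as conn: ...`. Keeping `size` small bounds the extra memory, CPU and file handles on both the client and the broker.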

Bozho answered Oct 18 '22 18:10
