Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Large RabbitMQ message in Slow network

I am using RabbitMQ with Spring AMQP

  • large message (>100MB, 102400KB)
  • small bandwidth (<512Kbps)
  • low heartbeat interval (10 seconds)
  • single broker

It will take >= 200*8 seconds to consume the message, which is more than my heartbeat interval. From https://stackoverflow.com/a/42363685/418439

If the message transfer time between nodes (60seconds?) > heartbeat time between nodes, it will cause the cluster to disconnect and the loose the message

  1. Will I also face the disconnection issue even I am using single broker?
  2. Does the heartbeat and consumer using the same thread, where if consumer is consuming, it is not possible to perform heartbeat?
  3. If so, what can I do to consume the message, without increase heartbeat interval or reduce my message size?

Update:

I have received another answer and comments after I posted my own answer. Thanks for the feedback. Just to clarify, I do not use AMQP for file transfer. Actually the data is in JSON message, some are simple and small but some contain complex information, include some free hand drawing. Besides saving the data at Data Center, we also save a copy of message at branch level via AMQP, for case connectivity to Data Center is not available.

like image 717
Lee Chee Kiam Avatar asked Dec 07 '22 17:12

Lee Chee Kiam


1 Answers

So, the real questions here are a bit more fundamental, and those are: (1) is it appropriate to perform a large file transfer via AMQP, and (2) what purpose does the heartbeat serve?

Heartbeats

First off, let's address the heartbeat question. As the RabbitMQ documentation clearly states, the purpose of the heartbeat is "to ensure that the application layer promptly finds out about disrupted connections."

The reason for this is simple. In an ordinary AMQP usage, there may be several seconds, even minutes between the arrival of successive messages. Without data being exchanged across a TCP session, many firewalls and other networking equipment automatically close ports to lower exposure to the enterprise network. Heartbeats further help mitigate a fundamental weakness in TCP, which is the difficulty of detecting a dropped connection. Networks experience failure, and TCP is not always able to detect that on its own.

So, the bottom line here is that, while you're transferring a large message, the connection is active and the heartbeat function serves no useful purpose, and can cause you trouble. It's best to turn it off in such cases.

AMQP For Moving Large Files?

The second issue, and I believe more important question, is how should large files be dealt with. To answer this, let's first consider what a message queue does: sending messages -- small bits of data which communicate something to another computer system. The operative word here is small. Messages typically contain one of three things: 1. commands (go do something), 2. events (something happened), 3. requests (give me some data), and 4. responses (here is your data). A full discussion on these is beyond the scope, but suffice it to say that each of these can generally be composed of a small message less than 100kB.

Indeed, the AMQP protocol, which underlies RabbitMQ, is a fairly chatty protocol. It requires large messages be divided into multiple segments of no more than 131kB. This can add a significant amount of overhead to a large file transfer, especially when compared to other file transfer mechanisms (FTP, for instance). Secondly, the message has to be fully processed by the broker before it is made available in a queue, and it ties up valuable resources on the broker while this is being done. For one, the whole message must fit into RAM on the broker due to its architecture. This solution may work for one client and one broker, but it will break quickly when scaling out is attempted.

Finally, compression is often desirable when transferring files - HTTP supports gzip compression automatcially. AMQP does not. It is quite common in message-oriented applications to send a message containing a resource locator (e.g. URL) pointing to the larger data file, which is then accessed via appropriate means.

The moral of the story

As the adage goes: "to the man with a hammer, everything looks like a nail." AMQP is not a hammer- it's a precision scalpel. It has a very specific purpose, and narrow applicability within that purpose. Using it for something other than its intended purpose will lead to stability and reliability problems in whatever it is you are designing, and overall dissatisfaction with your end product.

like image 114
theMayer Avatar answered Mar 04 '23 10:03

theMayer