Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

RabbitMQ: fast producer and slow consumer

Tags:

I have an application that uses RabbitMQ as the message queue to send/receive message between two components: sender and receiver. The sender sends message in a very fast way. The receiver receives the message and then does some very time-consuming task (mainly database writing for very large data size). Since the receiver takes a very long time to finish the task and then retrieve the next message in the queue, the sender will keep filling up the queue quickly. So my question is: Will this cause the message queue to overflow?

The message consumer looks like the following:

public void onMessage() throws IOException, InterruptedException {
    channel.exchangeDeclare(EXCHANGE_NAME, "fanout");
    String queueName = channel.queueDeclare("allDataCase", true, false, false, null).getQueue();
    channel.queueBind(queueName, EXCHANGE_NAME, "");

    QueueingConsumer consumer = new QueueingConsumer(channel);
    channel.basicConsume(queueName, true, consumer);

    while (true) {
        QueueingConsumer.Delivery delivery = consumer.nextDelivery();
        String message = new String(delivery.getBody());
        System.out.println(" [x] Received '" + message + "'");

        JSONObject json = new JSONObject(message);
        String caseID = json.getString("caseID");
        //following takes very long time            
        dao.saveToDB(caseID);
    }
}

Each message received by the consumer contains a caseID. For each caseID, it will save large amount of data to the database, which takes very long time. Currently only one consumer is set up for the RabbitMQ since producer/consumer use the same queue for the publish/subscribe of caseID. So how can I speed up the consumer throughput so that the consumer can catch up with the producer and avoid the message overflow in the queue? Should I use multithreading in the consumer part to speed up the consumption rate? Or should I use multiple consumers to consume the incoming message simutaneously? Or is there any asynchronous way to let the consumer consume the message asynchronously without waiting it to finish? Any suggestions are welcome.

like image 518
tonga Avatar asked Oct 28 '14 20:10

tonga


People also ask

What is producer and consumer in RabbitMQ?

A producer is a user application that sends messages. A queue is a buffer that stores messages. A consumer is a user application that receives messages.

Can RabbitMQ have multiple consumers?

RabbitMQ has a plugin for consistent hash exchange. Using that exchange, and one consumer per queue, we can achieve message order with multiple consumers. The hash exchange distributes routing keys among queues, instead of messages among queues. This means all messages with the same routing key will go the same queue.

How many messages can RabbitMQ handle per second?

The RabbitMQ message broker was deployed atop Google Compute Engine where it demonstrated the ability to receive and deliver more than one million messages per second (a sustained combined ingress/egress of over two million messages per second).


2 Answers

"Will this cause the message queue to overflow?"

Yes. RabbitMQ will enter a state of "flow control" to prevent excessive memory consumption as the queue length increases. It will also start persisting messages to disk, rather than hold them in memory.

"So how can I speed up the consumer throughput so that the consumer can catch up with the producer and avoid the message overflow in the queue"

You have 2 options:

  1. Add more consumers. Bear in mind that your DB will now be manipulated by multiple concurrent processes if you choose this option. Ensure that the DB can withstand the extra pressure.
  2. Increase the QOS value of the consuming channel. This will pull more messages from the queue and buffer them on the consumer. This will increase the overall processing time; if 5 messages are buffered, the 5th message will take the processing time of messages 1...5 to complete.

"Should I use multithreading in the consumer part to speed up the consumption rate?"

Not unless you have a well-designed solution. Adding parallelism to an application is going to add a lot of overhead on the consumer-side. You may end up exhausting the ThreadPool or throttling memory-usage.

When dealing with AMQP, you really need to consider the business requirement for each process in order to design the optimal solution. How time-sensitive are your incoming messages? Do they need to be persisted to DB ASAP, or does it matter to your users whether or not that data is available immediately?

If the data does not need to be persisted immediately, you could modify your application so that the consumer(s) simply remove messages from the queue and save them to a cached collection, in Redis, for example. Introduce a second process which then reads and processes the cached messages sequentially. This will ensure that your queue-length does not grow sufficiently to result in flow-control, while preventing your DB from being bombarded with write requests, which are typically more expensive than read requests. Your consumer(s) now simply remove messages from the queue, to be dealt with by another process later.

like image 109
Paul Mooney Avatar answered Oct 31 '22 19:10

Paul Mooney


You have lot of ways to increase your performance.

  1. You can create a worker queue with more producers, in this way you create an simple load-balance system. don't use exchange---> queue but only queue. Read this post RabbitMQ Non-Round Robin Dispatching

  2. When you get a message you can create a poolthread for insert the data on your database, but in this case you have to manage the failure.

But I think the principal problem is the database and not RabbitMQ. With a good tuning, multi-threading and worker queue you can have a scalable and fast solution.

Let me know

like image 21
Gabriele Santomaggio Avatar answered Oct 31 '22 17:10

Gabriele Santomaggio