
RabbitMQ and Delivery Guarantees in Distributed Database Transaction

I am trying to understand what the right pattern is for dealing with RabbitMQ deliveries in the context of a distributed database transaction.

To make this simple, I will illustrate my ideas in pseudocode, but I'm in fact using Spring AMQP to implement these ideas.

Something like:

void foo(message) {
   processMessageInDatabaseTransaction(message);
   sendMessageToRabbitMQ(message);
}

Here, by the time we reach sendMessageToRabbitMQ(), processMessageInDatabaseTransaction() has either successfully committed its changes to the database or thrown an exception, in which case the message-sending code is never reached.

I know that for sendMessageToRabbitMQ() I can use Rabbit transactions or publisher confirms to guarantee that Rabbit got my message.

My interest is in understanding what should happen when things go south, i.e. when the database transaction succeeded but the confirmation does not arrive within a certain amount of time (with publisher confirms) or the Rabbit transaction fails to commit (with Rabbit transactions).

Once that happens, what is the right pattern to guarantee delivery of my message?

Of course, having developed idempotent consumers, I have considered that I could retry the sending of the messages until Rabbit confirms success:

void foo(message) {
   processMessageInDatabaseTransaction(message);
   retryUntilSuccessful {
      sendMessageToRabbitMQ(message);
   }
}

But this pattern has a couple of drawbacks I dislike. First, if the failure is prolonged, my threads will start to block here and my system will eventually become unresponsive. Second, what happens if my system crashes or shuts down? The messages still waiting to be sent will be lost and never delivered.
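For illustration, the first drawback can be softened by bounding the loop, so a thread gives up after a few attempts instead of blocking forever. This is only a minimal sketch: BoundedRetry and publishAndConfirm are hypothetical names, and publishAndConfirm stands in for whatever send call returns true only once Rabbit has confirmed the message. It does nothing for the second drawback, since a crash still loses whatever was in flight.

```java
import java.time.Duration;
import java.util.function.Predicate;

/** Hypothetical helper: retries a publish a bounded number of times, then gives up. */
final class BoundedRetry {

    /**
     * Calls publishAndConfirm up to maxAttempts times, doubling the delay between attempts.
     * Returns true once an attempt reports a confirm, false if every attempt failed
     * or the thread was interrupted while waiting.
     */
    static <M> boolean publish(M message, Predicate<M> publishAndConfirm,
                               int maxAttempts, Duration initialDelay) {
        long delayMillis = initialDelay.toMillis();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (publishAndConfirm.test(message)) {
                return true; // assumed to mean "the broker confirmed the message"
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return false;
                }
                delayMillis *= 2; // exponential backoff
            }
        }
        return false; // the caller still has to keep the message somewhere durable
    }
}
```

A false return is the signal that the message has to be kept somewhere durable for a later attempt.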

So, I thought, well, I will have to write my messages to the database first, in pending status, and then publish my pending messages from there:

void foo(message) {
   //transaction commits leaving message in pending status
   processMessageInDatabaseTransaction(message);
}

@Poller(every="10 seconds")
void bar() {
   for(message in readPendingMessagesFromDbStore()) {
      sendPendingMessageToRabbitMQ(message);
      if(confirmed) {
          acknowledgeMessageInDatabase(message); 
      }
   }
}

This may send a message more than once if I fail to acknowledge it in my database, which is acceptable since my consumers are idempotent.

But now I have introduced other problems:

  • The need to do database I/O to publish a message that, 99% of the time, would have been published successfully right away without having to check the database.
  • The difficulty of getting the poller close to real-time delivery, since I have now added latency to the publication of the messages.
  • And perhaps other complications, like guaranteeing in-order delivery of events, poller executions overlapping one another, multiple pollers, etc.
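Two of these complications, ordering and overlapping poller runs, can be sketched without a real database. The following is a hedged, in-memory illustration only: OutboxPoller, enqueue, pollOnce and publishAndConfirm are hypothetical names, the map stands in for a pending-messages table read in id order, and the guard only protects a single process (multiple poller instances would need something like row locking, which is not shown).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Predicate;

/** In-memory stand-in for a "pending messages" table plus its poller (hypothetical). */
final class OutboxPoller {
    // LinkedHashMap keeps insertion order, standing in for "ORDER BY id".
    private final Map<Long, String> pending = new LinkedHashMap<>();
    private final AtomicBoolean running = new AtomicBoolean(false);
    private long nextId = 1;

    /** In a real system this insert would be part of the business transaction. */
    synchronized long enqueue(String payload) {
        long id = nextId++;
        pending.put(id, payload);
        return id;
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    private synchronized Map<Long, String> snapshot() {
        return new LinkedHashMap<>(pending);
    }

    private synchronized void acknowledge(long id) {
        pending.remove(id);
    }

    /**
     * Runs one poll cycle. Returns the number of messages confirmed and acknowledged,
     * or -1 if a previous cycle is still running and this one was skipped.
     */
    int pollOnce(Predicate<String> publishAndConfirm) {
        if (!running.compareAndSet(false, true)) {
            return -1; // avoid two cycles stepping on each other
        }
        try {
            int sent = 0;
            for (Map.Entry<Long, String> entry : snapshot().entrySet()) {
                if (!publishAndConfirm.test(entry.getValue())) {
                    break; // stop at the first failure so later messages do not overtake it
                }
                acknowledge(entry.getKey()); // a crash before this line causes a re-send
                sent++;
            }
            return sent;
        } finally {
            running.set(false);
        }
    }
}
```

Stopping at the first unconfirmed message trades throughput for ordering; if order does not matter, the loop could continue past failures instead.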

And then I thought, well, I could make this a bit more complicated: publish from the database until I catch up with the live stream of events, and then publish in real time. That is, maintain a circular buffer of size b and, as I read pages from the database, check whether each message is already in the buffer. If so, switch to the live subscription.

At this point I realized that how to do this right is not exactly evident, so I concluded that I need to learn what the right patterns are for solving this problem.

So, does anyone have suggestions on the right way to do this?

Edwin Dalorzo asked Feb 21 '17 16:02



2 Answers

While RabbitMQ cannot participate in a truly global (XA) transaction, you can use Spring transaction management to synchronize the database transaction with the Rabbit transaction, such that if either update fails, both transactions will be rolled back. There is a (very) small timing hole where one might commit but not the other, so you do need to deal with that possibility.

See Dave Syer's JavaWorld article for more details.
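To make the timing hole concrete, here is a simplified sketch of the commit ordering. It is not Spring's implementation: TxResource and BestEffortCommit are hypothetical names, and the sketch assumes the database commits first and the Rabbit transaction second.

```java
/** Hypothetical transactional resource: stands in for a DB connection or a Rabbit channel. */
interface TxResource {
    void commit();
    void rollback();
}

final class BestEffortCommit {
    enum Outcome { BOTH_COMMITTED, BOTH_ROLLED_BACK, DB_ONLY_COMMITTED }

    /** Commits the database first, then Rabbit (assumed ordering for this sketch). */
    static Outcome commit(TxResource database, TxResource rabbit) {
        try {
            database.commit();
        } catch (RuntimeException e) {
            // Nothing is committed yet, so both sides can still be rolled back.
            database.rollback();
            rabbit.rollback();
            return Outcome.BOTH_ROLLED_BACK;
        }
        try {
            rabbit.commit();
            return Outcome.BOTH_COMMITTED;
        } catch (RuntimeException e) {
            // The timing hole: the database commit can no longer be undone,
            // so the application has to compensate or re-send the message.
            return Outcome.DB_ONLY_COMMITTED;
        }
    }
}
```

The DB_ONLY_COMMITTED outcome is the case the question asks about: the data is stored but the message was never published.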

Gary Russell answered Oct 08 '22 05:10



When Rabbit fails to receive a message (for whatever reason, but in my experience only because the service is down or unavailable), you should be in a position to catch an error. At this point, you can make a record of that failed attempt, and of any subsequent ones, in order to retry when Rabbit becomes available again. The quickest way of doing this is just logging the message details to a file, and iterating over it to re-send when appropriate.

As long as you have that file, you've not lost your messages.
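A minimal sketch of that idea, using only the JDK, could look like the following. FailedMessageSpool, record, replay and publishAndConfirm are hypothetical names; the sketch assumes one message per line (payloads without newlines), and the rewrite in replay is not crash-safe.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical file-backed record of messages that could not be published. */
final class FailedMessageSpool {
    private final Path file;

    FailedMessageSpool(Path file) {
        this.file = file;
    }

    /** Appends one message per line; assumes the payload contains no newline. */
    synchronized void record(String message) {
        try {
            Files.write(file, List.of(message), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Re-sends every recorded message, keeping the ones that still fail. Returns the number re-sent. */
    synchronized int replay(Predicate<String> publishAndConfirm) {
        try {
            if (!Files.exists(file)) {
                return 0;
            }
            List<String> stillFailing = new ArrayList<>();
            int sent = 0;
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                if (publishAndConfirm.test(line)) {
                    sent++;
                } else {
                    stillFailing.add(line);
                }
            }
            // Not crash-safe: a production version would write a temp file and move it into place.
            Files.write(file, stillFailing, StandardCharsets.UTF_8);
            return sent;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because a crash between a successful re-send and the rewrite leaves the message in the file, this gives at-least-once delivery and relies on consumers being idempotent.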

Once messages are inside Rabbit, and you have faith in the rest of the architecture, it should be safe to assume that messages will end up where they are supposed to be, and that no further persistence work needs doing at your end.

HomerPlata answered Oct 08 '22 04:10
