Implementing sagas with Kafka

I am using Kafka for Event Sourcing and I am interested in implementing sagas using Kafka.

Any best practices on how to do this? The Commander pattern mentioned here seems close to the architecture I am trying to build but sagas are not mentioned anywhere in the presentation.

George asked May 08 '17

2 Answers

I would like to add something here about sagas and Kafka.

In general


In general, Kafka is a bit different from a normal message queue. It is especially good at scaling, and that is exactly what can cause some complications.

To accomplish scaling, Kafka partitions the data stream. Data is placed in partitions, each of which can be consumed at its own rate, independently of the other partitions of the same topic. Here is some info on it: how-choose-number-topics-partitions-kafka-cluster. I'll come back to why this is important.

The most common ways to ensure the order within Kafka are:

  • Use 1 partition for the topic
  • Use a partition message key to "assign" the message to a partition

In both scenarios your chronologically dependent messages need to stream through the same partition.
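The key-based option can be sketched as follows. Kafka's default partitioner actually uses a murmur2 hash of the key; the helper below uses crc32 purely to illustrate the principle, and the saga id and event names are made up:

```python
# Sketch of key-based partition assignment: all messages that share a key
# land in the same partition, so their relative order is preserved.
# crc32 is a stand-in for Kafka's real (murmur2-based) partitioner;
# it is stable across runs, unlike Python's built-in hash().
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Every event of the same saga uses the saga id as the message key...
saga_id = "booking-42"
events = ["RoomReserved", "RoomPaid", "BookingAcknowledged"]
partitions = {partition_for(saga_id, 6) for _ in events}

# ...so all of them map to one partition and their order is preserved.
assert len(partitions) == 1
```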

Also, as @pranjal thakur points out, make sure delivery is configured for exactly-once semantics, which has a performance impact but ensures you will not process the same message multiple times.

The caveat


Now, here's the caveat: when you change the number of partitions, the distribution of messages over the partitions (when using a key) changes as well.

In normal conditions this can be handled easily. But in a high-traffic situation, migrating to a different number of partitions can create a window of time in which a saga "flow" is spread over multiple partitions, and order is no longer guaranteed.
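A small sketch of why repartitioning breaks the mapping; the integer values below stand in for hashed message keys:

```python
# When the partition count changes, hash % num_partitions changes for most
# keys, so a saga whose events straddle the repartitioning moment ends up
# split across partitions. The integers stand in for hashed message keys.
def partition_for(key_hash: int, num_partitions: int) -> int:
    return key_hash % num_partitions

hashes = range(100)
moved = [h for h in hashes if partition_for(h, 6) != partition_for(h, 8)]

# e.g. key hash 6: partition 0 with 6 partitions, partition 6 with 8
assert partition_for(6, 6) == 0
assert partition_for(6, 8) == 6
assert 6 in moved
```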

It's up to you whether this will be an issue in your scenario.

Here are some questions you can ask to determine if this applies to your system:

  • What will happen if you somehow need to migrate/copy data to a new system, using Kafka?
    (high traffic scenario)
  • Can you send your data to 1 topic?
  • What will happen after a temporary outage of your saga service?
    (low availability scenario/high traffic scenario)
  • What will happen when you need to replay a bunch of messages?
    (high traffic scenario)
  • What will happen if we need to increase the partitions?
    (high traffic scenario/outage & recovery scenario)

The alternative


If you're thinking of setting up a saga, based on steps, like a state machine, I would challenge you to rethink your design a bit.

I'll give an example:

Let's consider a booking-a-hotel-room process.

Simplified, it might consist of the following steps:

  • Handle room reserved (incoming event)
  • Handle room paid (incoming event)
  • Send acknowledgement of the booking (after payment and some processing)

Now, if your saga is not able to handle the payment if the reservation hasn't come in yet, then you are relying on the order of events.

In this case you should ask yourself: when will this break?


If you conclude you want to avoid the chronological dependency, consider a system without a saga, or a saga which does not depend on the order of events - i.e. one that accepts all messages, even when it is not yet their turn in the process.

Some examples:

  • aggregators
  • Modeled as business process: parallel gateways (parallel process flows)
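The aggregator idea can be sketched like this; `BookingSaga` and the event names are illustrative, not part of any framework:

```python
# A saga sketch that does not depend on event order: it records whatever
# arrives and acts only once all required facts are present (an aggregator).
class BookingSaga:
    REQUIRED = {"RoomReserved", "RoomPaid"}

    def __init__(self):
        self.seen = set()
        self.acknowledged = False

    def handle(self, event: str) -> None:
        self.seen.add(event)
        # Act only when every required event has been observed, in any order.
        if self.REQUIRED <= self.seen and not self.acknowledged:
            self.acknowledged = True  # send the acknowledgement here

# Payment arriving before the reservation is no longer a problem:
saga = BookingSaga()
saga.handle("RoomPaid")
assert not saga.acknowledged
saga.handle("RoomReserved")
assert saga.acknowledged
```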

Do note that in such a setup it is even more crucial that every action has an implemented compensating action (rollback action).
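A minimal sketch of what such compensating actions could look like; the `run_saga` helper and the step names are hypothetical, not a real API:

```python
# Each completed step registers a rollback; on failure the saga undoes the
# completed steps in reverse order. Steps are (name, action, compensate).
def run_saga(steps):
    completed = []  # (name, compensate) for every step that succeeded
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            for _done_name, undo in reversed(completed):
                undo()  # roll back in reverse order
            return [n for n, _ in completed]  # names of rolled-back steps
        completed.append((name, compensate))
    return []

log = []

def charge():  # hypothetical failing step
    raise RuntimeError("payment failed")

steps = [
    ("reserve", lambda: log.append("reserve"), lambda: log.append("undo-reserve")),
    ("charge", charge, lambda: log.append("undo-charge")),
]
rolled_back = run_saga(steps)
assert rolled_back == ["reserve"]
assert log == ["reserve", "undo-reserve"]
```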

I know this is often hard to accomplish; but, if you start small, you might start to like it :-)

Stefan answered Dec 11 '22


This talk from this year's DDD eXchange is the best resource I came across wrt Process Manager/Saga pattern in event-driven/CQRS systems: https://skillsmatter.com/skillscasts/9853-long-running-processes-in-ddd (requires registering for a free account to view)

The demo shown there lives on github: https://github.com/flowing/flowing-retail

I've given it a spin and I quite like it. I do recommend watching the video first to set the stage.

Although the approach shown is message-bus agnostic, the demo uses Kafka for the Process Manager to send commands to and listen to events from other bounded contexts. It does not use Kafka Streams but I don't see why it couldn't be plugged into a Kafka Streams topology and become part of the broader architecture like the one depicted in the Commander presentation you referenced.

I hope to investigate this further for our own needs, so please feel free to start a thread on the Kafka users mailing list, that's a good place to collaborate on such patterns.

Hope that helps :-)

Michal Borowiecki answered Dec 11 '22