
Where to store events in a distributed system which uses event sourcing?

Given you have multiple systems, which are integrated by events, and all of them are using event sourcing. Where do you store the events?

In my case I have three systems:

  • A website, which is a shop
  • A backend for the website to manage customers, products etc.
  • An accounting system

Whenever a domain event happens in one of those systems the event is published and can be processed by the other systems. All systems are using event sourcing.

I am wondering where you would save the events. Of course each system has to store all events that it processed because it is using event sourcing and therefore depends on the events it once processed.

But what about the other events that were not needed and that the system therefore did not subscribe to? I am struggling with the fact that requirements can change, such that a system would have to process events from the past that it did not persist. Where would you get these events from if the system needed to process events that it was not subscribed to when they occurred?

I think this is a big difference from systems that do not use event sourcing. If you have to implement a feature in a system A which depends on data that is not available in A but in another system B, and you persist current state via an ORM tool like NHibernate, you can simply import that data from B into A. Since a system that uses event sourcing depends on events to get to its current state, you have to import all the events that you missed in the past but need now.

For me there are a few different approaches to this problem.

  1. Each system saves all events that it publishes. This gives you the ability to republish the events if needed or to import them into another system.
  2. Each system saves all events that happen, even those which do not need to be processed (yet).
  3. All events from all systems are stored in a central event log. If you need to process an event that happened in the past but that you did not subscribe to, you can import it from here.
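To make the third approach concrete, here is a minimal in-memory sketch of a central event log. The `Event` fields, the `CentralEventLog` class and its `append`/`read` methods are hypothetical names chosen for illustration; a real log would be durable and shared across processes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Event:
    event_id: str   # unique per event (hypothetical field)
    source: str     # system that published the event
    type: str       # e.g. "CustomerRegistered"
    payload: dict[str, Any] = field(default_factory=dict)


class CentralEventLog:
    """Append-only log that records every event, subscribed to or not."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> int:
        """Store the event and return its position in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, types: set[str], from_position: int = 0) -> list[Event]:
        """Return stored events of the given types, in original order."""
        return [e for e in self._events[from_position:] if e.type in types]


log = CentralEventLog()
log.append(Event("1", "website", "CustomerRegistered", {"customer": "c1"}))
log.append(Event("2", "website", "CustomerMarkedProductAsFavorite",
                 {"customer": "c1", "product": "p9"}))
log.append(Event("3", "website", "CustomerPurchasedProduct",
                 {"customer": "c1", "product": "p4"}))

# A system that never subscribed to favorites can still import them later.
favorites = log.read({"CustomerMarkedProductAsFavorite"})
```

The trade-off against approaches 1 and 2 is that the log becomes a shared dependency of all three systems.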

How do you handle such a situation? Where do you save your events?

Edit

Thanks Roy Dictus for your answer. I'm still not sure how to handle the following situation:

The website publishes the events CustomerRegistered, CustomerPurchasedProduct and CustomerMarkedProductAsFavorite. In the current version of the backend, customers and their purchases have to be displayed. What a customer marked as a favorite is not of interest in that version of the system. Thus the backend only subscribed to CustomerRegistered and CustomerPurchasedProduct.

Now the marketing department also wants the information about the favorite products to be shown on the customer details page. Since the backend didn't subscribe to CustomerMarkedProductAsFavorite this information is not available in the backend. Where do I get that information from?

Alebo asked Jun 04 '11 18:06




1 Answer

  1. Each system stores its own events. Each system is its own CQRS system, or at least its own self-contained service, and therefore is responsible for its own data.
  2. Each system also publishes its events to a service bus. This service bus determines where it saves these events. Usually it is in a transactional queuing system.
  3. Each system subscribes to the outside events it consumes. It does not store these incoming events, only its own events that result from them. When it consumes an incoming event, the service bus knows it can delete the event from that service's incoming queue.
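The three points above can be sketched with a toy in-memory bus. This is only an illustration under assumptions: `ServiceBus`, `website_publish`, `backend_handle` and the resulting event name `CustomerAddedToBackend` are all made up, and a real bus would use a transactional queue rather than a `deque`.

```python
from collections import defaultdict, deque


class ServiceBus:
    """Toy bus with one incoming queue per subscriber."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # event type -> subscriber names
        self._queues = defaultdict(deque)      # subscriber name -> pending events

    def subscribe(self, subscriber: str, event_type: str) -> None:
        self._subscribers[event_type].append(subscriber)

    def publish(self, event: dict) -> None:
        # Only subscribers of this event type get a copy in their queue.
        for subscriber in self._subscribers[event["type"]]:
            self._queues[subscriber].append(event)

    def consume(self, subscriber: str, handler) -> int:
        """Hand pending events to the handler and drop each once handled."""
        handled = 0
        queue = self._queues[subscriber]
        while queue:
            handler(queue[0])   # if the handler raises, the event stays queued
            queue.popleft()
            handled += 1
        return handled


website_store = []   # the website's own event store (point 1)
backend_store = []   # the backend's own event store
bus = ServiceBus()
bus.subscribe("backend", "CustomerRegistered")


def website_publish(event: dict) -> None:
    website_store.append(event)   # point 1: store your own event
    bus.publish(event)            # point 2: publish it to the bus


def backend_handle(event: dict) -> None:
    # Point 3: keep only the backend's own resulting event, not the incoming one.
    backend_store.append({"type": "CustomerAddedToBackend",
                          "customer": event["customer"]})


website_publish({"type": "CustomerRegistered", "customer": "c1"})
website_publish({"type": "CustomerMarkedProductAsFavorite",
                 "customer": "c1", "product": "p9"})
handled = bus.consume("backend", backend_handle)
```

Note that the favorite event never reaches the backend, but it is still in `website_store`, which is what makes a later replay possible.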

EDIT to accommodate your extra question:

If another application suddenly becomes interested in extra information, it has to add listeners to the events it is now interested in.

Furthermore, all sources of these events can then replay those events. Replay is a powerful feature of event-driven systems that allows for such scenarios. So, the event sources replay only the selected events (say, all CustomerMarkedProductAsFavorite events of the last 6 months). Systems that have already consumed these events should recognize that the replayed events are "old" ones (i.e., ones that they have already processed) and ignore them.

This way, any subsystem that is updated to use extra information from the other subsystems can get that information and become fully up to date in a single batch operation.
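A rough sketch of such a replay, assuming every event carries a unique `id` that consumers can use to recognize events they have already processed. How "old" events are detected is not specified above, so the id-based check, the `replay` function and the `Backend` class are assumptions for illustration. Filtering by time window is left out for brevity.

```python
from typing import Iterable


def replay(store: list[dict], event_type: str) -> Iterable[dict]:
    """Yield the stored events of one type, in their original order."""
    return (e for e in store if e["type"] == event_type)


class Backend:
    def __init__(self) -> None:
        self.seen_ids: set[str] = set()
        self.customers: list[str] = []
        self.favorites: dict[str, list[str]] = {}

    def handle(self, event: dict) -> bool:
        """Apply an event once; return False if it was already processed."""
        if event["id"] in self.seen_ids:
            return False
        self.seen_ids.add(event["id"])
        if event["type"] == "CustomerRegistered":
            self.customers.append(event["customer"])
        elif event["type"] == "CustomerMarkedProductAsFavorite":
            self.favorites.setdefault(event["customer"], []).append(event["product"])
        return True


# The website kept every event it published.
website_store = [
    {"id": "e1", "type": "CustomerRegistered", "customer": "c1"},
    {"id": "e2", "type": "CustomerMarkedProductAsFavorite",
     "customer": "c1", "product": "p9"},
]

backend = Backend()
backend.handle(website_store[0])   # consumed live, before the new requirement

# New requirement: the website replays only the favorite events.
results = [backend.handle(e)
           for e in replay(website_store, "CustomerMarkedProductAsFavorite")]

# An event that is replayed although it was already processed is ignored.
duplicate = backend.handle(website_store[0])
```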

Roy Dictus answered Oct 21 '22 08:10