Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Event-sourcing: Dealing with derived data

How does an event-sourcing system deal with derived data? All the examples I've read on event-sourcing demonstrate services reacting to fact events. A popular example seems to be:

Bank Account System

Events

  • Funds deposited
  • Funds withdrawn

Services

  • Balance Service

They then show how the Balance service can, at any point, derive a state (I.e. balance) from the events. That makes sense; those events are facts. There's no question that they happened - they are external to the system.

However, how do we deal with data calculated BY the system?

E.g.

Overdrawn service:

A services which is responsible for monitoring the balance and performing some action when it goes below zero.

Does the event-sourcing approach dictate how we should use (or not use) derived data? I.e. The balance. Perhaps one of the following?

1) Use: [Funds Withdrawn event] + [Balance service query]

Listen for the "Funds withdrawn" event and then ask the Balance service for the current balance.

2) Use: [Balance changed event]

Get the balance service to throw a "Balance changed" event containing the current balance. Presumably this isn't a "fact" as it's not external to the system, therefore prone to miscalculation.

3) Use: [Funds withdrawn event] + [Funds deposited event]

We could just skip the Balance service and have each service maintain its own balance directly from the facts. ...though that would result in each service having its own (potentially different) version of the balance.

like image 971
Andy N Avatar asked Jan 19 '16 16:01

Andy N


People also ask

What database is used for Event Sourcing?

In summary, we can say that RDBMS can easily be used as an event store, since it satisfies all the requirements natively. MongoDB is one of the most popular among the NoSQL databases. Its main characteristics include high scalability via sharding and schema-less storage.

What is the Event Sourcing pattern?

Solution. The Event Sourcing pattern defines an approach to handling operations on data that's driven by a sequence of events, each of which is recorded in an append-only store.

What is the difference between CQRS and Event Sourcing?

CQRS is implemented by a separation of responsibilities between commands and queries, and event sourcing is implemented by using the sequence of events to track changes in data.

What is the difference between event driven and Event Sourcing?

Event Sourcing is keeping a log for your own use so you don't forget. Event Driven Architecture is about communicating what happened to others. Typically, components in EDA can't recover the entirety of their state from the events they've published because not everything that changed their state is worth publishing.


2 Answers

A services which is responsible for monitoring the balance and performing some action when it goes below zero.

Executive summary: the way this is handled in event sourced systems is not actually all that different from the alternatives.

Stepping back a second - the advantage of having a domain model is to ensure that all proposed changes satisfy the business rules. Borrowing from the CQRS language: we send command messages to a command handler. The handler loads the state of the model, and tries to apply the command. If the command is allowed, the changes to the state of the domain model is updated and saved.

After persisting the state of the model, the command handler can query that state to determine if their are outstanding actions to be performed. Udi Dahan describes this in detail in his talk on Reliable messaging.

So the most straight forward way to describe your service is one that updates the model each time the account balance changes, and sets the "account overdrawn" flag if the balance is negative. After the model is saved, we schedule any actions related to that state.

Part of the justification for event sourcing is that the state of the domain model is derivable from the history. Which is to say, when we are trying to determine if the model allows a command, we load the history, and compute from the history the current state, and then use that state to determine whether the command is permitted.

What this means, in practice, is that we can write an AccountOverdrawn event at the same time that we write the AccountDebited event.

That AccountDebited event can be subscribed to - Pub/Sub. The typical handling is that the new events get published after they are successfully written to the book of record. An event listener subscribing to the events coming out of the domain model observes the event, and schedules the command to be run.

Digression: typically, we'll want at-least-once execution of these activities. That means keeping track of acknowledgements.

Therefore, the event handler is also a thing with state. It doesn't have any business state in it, and certainly no rules that would allow it to reject events. What it does track is which events it has seen, and which actions need to be scheduled. The rules for loading this event handler (more commonly called a process manager) are just like those of the domain model - load events from the book of record to obtain the current state, then see if the event being handled changes anything.

So it is really subscribing to two events - the AccountDebited event, and whatever event returns from the activity to acknowledge that it has completed.

This same mechanic can be used to update the domain model in response to events from elsewhere.

Example: suppose we get a FundsWithdrawn event from an ATM, and we need to update the account history to match it. So our event handler gets loaded, updates itself, and schedules a RecordATMWithdrawal command to be run. When the command loads, it loads the account, updates the balances, and writes out the AccountCredited and AccountOverdrawn events as before. The event handler sees these events, loads the correct state process state based on the meta data, and updates the state of the process.

In CQRS terms, this is all taking place in the "write models"; these processes are all about updating the book of record.

The balance query itself is easy - we already showed that the balance can be derived from the history of the domain model, and that's just how your balance service is expected to do it.

To sum up; at any given time you can load the history of the domain model, to query its state, and you can load up the history of the event processor, to determine what work has yet to be acknowledged.

like image 143
VoiceOfUnreason Avatar answered Oct 23 '22 04:10

VoiceOfUnreason


Event sourcing is an evolving discipline with a bunch of diverse practices, practitioners and charismatic people. You can't expect them to provide you with some very consistent modelling technique for all scenarios like you described. Each one of those scenarios has it's pros and cons and you specified some of them. Also it may vary dramatically from one project to another, because business requirements (evolutionary pressures of the market) will be different.

If you are working on some mission-critical system and you want to have very consistent balance all the time - it's better to use RDBMS and ACID transactions.

If you need maximum speed and you are okay with eventually consistent states and not very anxious about precision of your balances (some events may be missing here and there for bunch of reasons) then you can derive your projections for balances from events asynchronously.

In both scenarios you can use event sourcing, but you don't necessarily have to generate your projections asynchronously. It's okay to generate projection in the same transaction scope as you making changes to your write model if you really need to do that.

Will it make Greg Young happy? I have no idea, but who cares about such things if your balances one day may go out of sync in mission-critical system ...

like image 34
IlliakaillI Avatar answered Oct 23 '22 04:10

IlliakaillI