Event sourcing with Kafka Streams

I'm trying to implement a simple CQRS/event sourcing proof of concept on top of Kafka Streams (as described in https://www.confluent.io/blog/event-sourcing-using-apache-kafka/).

I have 4 basic parts:

  1. commands topic, which uses the aggregate ID as the key for sequential processing of commands per aggregate
  2. events topic, to which every change in aggregate state is published (again, the key is the aggregate ID). This topic has a retention policy of "never delete"
  3. A KTable to reduce aggregate state and save it to a state store

    events topic stream ->
    group to a KTable by aggregate ID ->
    reduce aggregate events to current state ->
    materialize as a state store
    
  4. commands processor - a commands stream, left-joined with the aggregate state KTable. For each entry in the resulting stream, a function (command, state) => events produces the resulting events and publishes them to the events topic (see the sketch after this list)
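
A minimal sketch of parts 3 and 4 as a Kafka Streams topology, assuming hypothetical Command, Event and AggregateState types, their serdes (commandSerde, eventSerde), a state.apply(event) fold and a command.decide(state) function returning a list of events; note that aggregate() is used rather than reduce() because the state type differs from the event type:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.common.utils.Bytes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.kstream.Produced;
    import org.apache.kafka.streams.state.KeyValueStore;

    StreamsBuilder builder = new StreamsBuilder();

    // (3) Fold the events topic into a KTable of current aggregate state,
    // materialized as a queryable state store.
    KTable<String, AggregateState> stateTable = builder
            .stream("events", Consumed.with(Serdes.String(), eventSerde))
            .groupByKey()
            .aggregate(
                    AggregateState::empty,                       // initial state
                    (aggregateId, event, state) -> state.apply(event),
                    Materialized.<String, AggregateState, KeyValueStore<Bytes, byte[]>>as(
                            "aggregate-state-store"));

    // (4) Left-join the commands stream with the state KTable; decide()
    // yields zero or more events, which go back to the events topic.
    builder.stream("commands", Consumed.with(Serdes.String(), commandSerde))
            .leftJoin(stateTable, (command, state) -> command.decide(state))
            .flatMapValues(events -> events)                     // flatten List<Event>
            .to("events", Produced.with(Serdes.String(), eventSerde));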

The question is - is there a way to make sure I have the latest version of the aggregate in the state store?

I want to reject a command if it violates business rules (for example, a command to modify an entity is not valid if the entity was marked as deleted). But if a DeleteCommand is published and a ModifyCommand follows right after it, the delete command will produce the DeletedEvent, yet when the ModifyCommand is processed, the state loaded from the state store might not reflect the deletion yet, and conflicting events will be published.

I don't mind sacrificing command-processing throughput; I'd rather have the consistency guarantees (since everything is keyed by aggregate ID and should end up in the same partition).

Hope that was clear :) Any suggestions?

asked Mar 21 '18 by amitayh



1 Answer

I don't think Kafka is good for CQRS and event sourcing yet, the way you described it, because it lacks a (simple) way of ensuring protection from concurrent writes. This article talks about this in detail.

By "the way you described it" I mean that you expect a command to generate zero or more events, or to fail with an exception; this is classical CQRS with event sourcing. Most people expect this kind of architecture.

You could, however, do event sourcing in a different style. Your command handlers could yield an event for every command that is received (e.g. DeleteWasAccepted). Then, an event handler could eventually handle that event in an event-sourced way (by rebuilding the aggregate's state from its event stream) and emit the outcome events (e.g. ItemDeleted or ItemDeletionWasRejected). So commands are fire-and-forget, sent asynchronously; the client does not wait for an immediate response. It does, however, wait for an event describing the outcome of its command execution.
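
A minimal sketch of such a second-stage handler, assuming hypothetical loadEvents and publish helpers plus the event and state types named above; this handler is the only place that checks the business rule, against state rebuilt from the aggregate's own event stream:

    // Hypothetical event handler for the second stage.
    class DeleteAcceptedHandler {

        void handle(DeleteWasAccepted accepted) {
            // Rebuild current state by replaying this aggregate's events
            // (loadEvents is an assumed event-store lookup by aggregate ID).
            AggregateState state = AggregateState.empty();
            for (Event event : loadEvents(accepted.aggregateId())) {
                state = state.apply(event);
            }

            // Decide the outcome and emit it as a new event.
            if (state.isDeleted()) {
                publish(new ItemDeletionWasRejected(accepted.aggregateId(), "already deleted"));
            } else {
                publish(new ItemDeleted(accepted.aggregateId()));
            }
        }
    }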

An important aspect is that the event handler must process events from the same aggregate serially (exactly once and in order). This can be implemented using a single Kafka consumer group. You can see this architecture described in this video.
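
A minimal sketch of that consumer side, assuming the events topic is keyed by aggregate ID and a hypothetical com.example.EventDeserializer; per-partition ordering plus manual commits after successful handling give the serial, in-order processing (at-least-once in practice, so handlers should deduplicate to approximate exactly-once):

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "event-handlers");    // the single group
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");   // commit manually
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringDeserializer");
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
            "com.example.EventDeserializer");                       // assumed deserializer

    try (KafkaConsumer<String, Event> consumer = new KafkaConsumer<>(props)) {
        consumer.subscribe(List.of("events"));
        while (true) {
            // Records within a partition arrive in offset order; since events
            // are keyed by aggregate ID, each aggregate's events are serial.
            for (ConsumerRecord<String, Event> record : consumer.poll(Duration.ofMillis(500))) {
                handle(record.value());
            }
            // Commit only after successful handling, so a crash replays the
            // batch instead of losing it.
            consumer.commitSync();
        }
    }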

answered Oct 31 '22 by Constantin Galbenu