A fair number of articles describe Kafka Streams applications that write their results to a new Kafka topic instead of saving them to some sort of distributed database.
Is this just a common use case that assumes the embedded state store plus interactive queries is sufficient, or is there some architectural reason to output to a topic and then consume it again to persist it, rather than persisting directly?
I'm not sure if it makes a difference, but the examples I'm looking at all involve tumbling time-windowed aggregation.
Kafka Streams guarantees ordering by offset, but not by timestamp. Thus, the default "last update wins" policy is based on offsets, not timestamps. Late-arriving records ("late" being defined by timestamp) are out of order with respect to timestamps, and they will not be reordered; the original offset order is preserved.
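For context, here is a minimal sketch of such a tumbling time-windowed aggregation, assuming a recent Kafka Streams version (3.x); the topic names, store name, and one-minute window size are made up for illustration:

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class TumblingWindowExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "tumbling-window-example");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        builder.<String, String>stream("events")
               .groupByKey()
               // Windows are assigned by record timestamp, so a late record
               // still lands in its original window; updates to that window
               // are applied in offset order ("last update wins" by offset).
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
               .count(Materialized.as("event-counts-store"))
               // Flatten the windowed key to a string so the result can be
               // written to an output topic with plain serdes.
               .toStream((windowedKey, count) ->
                       windowedKey.key() + "@" + windowedKey.window().start())
               .to("event-counts", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```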
Kafka Streams state is kept in persistent or in-memory state stores, which are themselves backed by Kafka changelog topics, providing full fault tolerance.
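To illustrate the interactive-queries alternative raised in the question, here is a sketch that reads the "event-counts-store" materialized in the previous example directly from the running KafkaStreams instance; the key and time range are made up:

```java
import java.time.Duration;
import java.time.Instant;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyWindowStore;
import org.apache.kafka.streams.state.WindowStoreIterator;

public class InteractiveQueryExample {
    // "streams" is the running KafkaStreams instance from the previous sketch.
    static void printRecentCounts(KafkaStreams streams) {
        ReadOnlyWindowStore<String, Long> store = streams.store(
                StoreQueryParameters.fromNameAndType(
                        "event-counts-store", QueryableStoreTypes.windowStore()));

        // Fetch the windows for one key over the last hour. The store is
        // served locally and restored from its changelog topic on failure,
        // so no output topic or external database is involved.
        try (WindowStoreIterator<Long> iter = store.fetch(
                "some-key", Instant.now().minus(Duration.ofHours(1)), Instant.now())) {
            iter.forEachRemaining(kv -> System.out.println(
                    "window start " + kv.key + " -> count " + kv.value));
        }
    }
}
```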
Key concepts of Kafka Streams: a stream processor is a node in a Streams topology. It receives a record from a topic or from its upstream processor, produces one or more records, and writes them to a Kafka topic or to a downstream processor.
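Expressed in the Processor API (the newer variant, Kafka 2.7+), a processor node looks roughly like this; the class name and the uppercase transform are made up for illustration:

```java
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;

// One node in a Streams topology: it receives a record from its upstream
// processor (or a source topic), produces a record, and forwards it to
// the downstream processor (or a sink topic).
public class UppercaseProcessor implements Processor<String, String, String, String> {
    private ProcessorContext<String, String> context;

    @Override
    public void init(ProcessorContext<String, String> context) {
        this.context = context;
    }

    @Override
    public void process(Record<String, String> record) {
        context.forward(record.withValue(record.value().toUpperCase()));
    }
}
```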
In the Kafka Streams DSL, you can use KStream#to() to materialize the KStream to a topic; this is the canonical way to write data to a topic. Alternatively, you can use KStream#through(), which also materializes data to a topic but additionally returns the resulting KStream for further use.
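A short sketch of both operations, with placeholder topic names; note that through() has since been deprecated in favour of KStream#repartition() in newer Kafka versions, so this compiles only against versions that still ship it:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class MaterializeExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> stream = builder.stream("input-topic");

        // to(): terminal; materializes the stream to a topic and ends the chain.
        stream.to("output-topic");

        // through(): materializes to a topic AND returns a KStream reading
        // from that topic, so processing can continue downstream.
        KStream<String, String> continued = stream.through("intermediate-topic");
        continued.to("final-topic");
    }
}
```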
If all you want is to take data out of Kafka and store it in a database, then Kafka Connect is the most natural way to go.
On the other hand, if your primary use case is aggregation, then Kafka Streams is indeed an easy and elegant way to go about it. If a Kafka Connect sink already exists for your preferred database, the most straightforward setup is to have Kafka Streams write its output to a topic and let that Kafka Connect sink pick it up and store it in your database. If no out-of-the-box sink exists, you would have to write one yourself, and if you don't think it would be reusable enough, you might instead write the persistence logic as a custom Kafka Streams processor and skip the output topic altogether (see the sketch below).
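For that custom-processor option, a rough sketch might look like the following; DbClient and its upsert() method are hypothetical stand-ins for whatever database driver you actually use:

```java
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.Record;

// Terminal processor that persists each record directly, with no output
// topic. DbClient is hypothetical; substitute your real database client.
public class DbSinkProcessor implements Processor<String, Long, Void, Void> {
    private final DbClient db = new DbClient(); // hypothetical client

    @Override
    public void process(Record<String, Long> record) {
        db.upsert(record.key(), record.value()); // hypothetical upsert call
    }

    @Override
    public void close() {
        db.close();
    }
}
```

You would attach this at the end of your topology, e.g. via KStream#process(DbSinkProcessor::new), trading the reusability of a Connect sink for a simpler deployment.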
As you can see, there are various ways to go depending on your use case and your preferences. There is no single correct way, so please consider the trade-offs involved.