What is the difference between a Kafka topic and a stream? I thought the two were the same.
This doc talks about creating a stream from a topic, which caused the confusion:
https://docs.ksqldb.io/en/latest/developer-guide/create-a-stream/
Questions:
The topic is the most important abstraction provided by Kafka: it is a category or feed name to which data is published by producers. Every topic in Kafka is split into one or more partitions. Kafka partitions data for storing, transporting, and replicating it. Kafka Streams partitions data for processing it.
A topic is basically a stream of data. Just as a database has tables, Kafka has topics. Each topic is split into partitions, so a partition is a part of the topic, and the data within a partition is ordered.
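To make the partition idea concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not the Kafka API: the class PartitionedTopicSketch and its methods are made up for illustration, and picking the partition from String.hashCode() is an assumption standing in for Kafka's real partitioner.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of a topic split into partitions (not the Kafka API).
class PartitionedTopicSketch {
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedTopicSketch(int numPartitions) {
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Assumption: choose the partition from the key's hashCode. Kafka's default
    // partitioner hashes the serialized key differently; only the idea carries over.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Appends a value to the end of its partition and returns its offset there.
    long append(String key, String value) {
        List<String> partition = partitions.get(partitionFor(key));
        partition.add(value);
        return partition.size() - 1;
    }

    // Returns a copy of one partition's values in append order.
    List<String> read(int partition) {
        return new ArrayList<>(partitions.get(partition));
    }

    public static void main(String[] args) {
        PartitionedTopicSketch topic = new PartitionedTopicSketch(3);
        topic.append("user-1", "login");
        topic.append("user-2", "login");
        topic.append("user-1", "logout");
        // Values with the same key share a partition, so their relative order is kept.
        System.out.println(topic.read(topic.partitionFor("user-1")));
    }
}
```

The point of the sketch is that ordering is a per-partition property: two records with the same key always land in the same list, while records with different keys may or may not.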
Kafka is primarily used to build real-time streaming data pipelines and applications that adapt to the data streams. It combines messaging, storage, and stream processing to allow storage and analysis of both historical and real-time data.
- What is the difference between a Kafka topic and a stream?
A. A stream is a flow of data, whether it comes from a single topic or from a collection of topics.
StreamsBuilder also has a stream(Collection<String> topics) overload, which means that a stream is not confined to a single topic.
- When a topic already gives us a stream of events, why do we need to create a stream from a topic?
A. The stream is the basic entity in Kafka Streams: a stream passes through a set of processors. The term "stream" is used in the context of Kafka Streams, which internally creates a consumer that consumes the topic(s).
As noted above, a stream can also span a collection of topics. So if you want to consume several topics and process them together, you create a stream over those topics.
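As a rough illustration of "one stream over several topics, passed through a set of processors", here is a plain-Java sketch. It does not use Kafka Streams at all: MultiTopicStreamSketch, publish() and process() are hypothetical names, the "broker" is just an in-memory map, and the topics are read one after another rather than interleaved as a real consumer would do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Toy model: a stream built over several topics and run through processors.
class MultiTopicStreamSketch {
    // Hypothetical in-memory stand-in for a broker: topic name -> values in arrival order.
    private final Map<String, List<String>> topics = new LinkedHashMap<>();

    void publish(String topic, String value) {
        topics.computeIfAbsent(topic, name -> new ArrayList<>()).add(value);
    }

    // Reads every value from the named topics and passes each one through the
    // processors in order. A processor that returns null drops the value.
    List<String> process(Collection<String> topicNames,
                         List<Function<String, String>> processors) {
        List<String> output = new ArrayList<>();
        for (String name : topicNames) {
            List<String> values = topics.get(name);
            if (values == null) {
                continue; // unknown topic: nothing to read
            }
            for (String value : values) {
                String current = value;
                for (Function<String, String> processor : processors) {
                    if (current == null) {
                        break;
                    }
                    current = processor.apply(current);
                }
                if (current != null) {
                    output.add(current);
                }
            }
        }
        return output;
    }

    public static void main(String[] args) {
        MultiTopicStreamSketch broker = new MultiTopicStreamSketch();
        broker.publish("orders", "book");
        broker.publish("returns", "lamp");
        broker.publish("orders", "");

        List<Function<String, String>> processors = new ArrayList<>();
        processors.add(v -> v.isEmpty() ? null : v); // filter step
        processors.add(v -> v.toUpperCase());        // map step

        // prints [BOOK, LAMP]
        System.out.println(broker.process(Arrays.asList("orders", "returns"), processors));
    }
}
```

In Kafka Streams the same shape would come from calling stream() with a collection of topic names and chaining operations on the result; the sketch only mirrors that structure.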
- Can we create a table from a topic directly, or do we have to create a stream first?
A. Yes, you can create a table from a topic directly, with either the Kafka clients API or the Kafka Streams API.
If you are using Kafka Streams in your application, you can use the StreamsBuilder#table() or StreamsBuilder#globalTable() methods.
If you are using the Kafka clients API, you have to consume the topic manually and populate a map or some other data structure with the messages.
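The "populate a map" approach can be sketched as follows. This is plain Java with no Kafka dependency: the records are hard-coded key/value pairs standing in for whatever a consumer would return, TableFromTopicSketch and materialize() are hypothetical names, and treating a null value as a delete is an assumption borrowed from the usual tombstone convention.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model: turn an ordered log of key/value records into a table (latest value per key).
class TableFromTopicSketch {
    // Applies the records in order. A later value replaces an earlier one for the
    // same key; a null value is treated as a delete (assumed tombstone convention).
    static Map<String, String> materialize(String[][] records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] rec : records) {
            String key = rec[0];
            String value = rec[1];
            if (value == null) {
                table.remove(key);
            } else {
                table.put(key, value);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        String[][] records = {
            {"alice", "Berlin"},
            {"bob", "Paris"},
            {"alice", "Rome"}, // update for an existing key
            {"bob", null}      // delete
        };
        // prints {alice=Rome}
        System.out.println(materialize(records));
    }
}
```

With the real clients API the loop body would stay much the same; the records would come from polling a consumer instead of from an array.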
Kafka Streams is meant for applications built as topologies. For simple applications that just consume, process, and commit, without multiple processing stages, the Kafka clients API should be good enough. Anything that can be achieved with Kafka Streams can also be achieved with the Kafka clients.
Kafka Streams makes complex workflows relatively simple, but it can also be used for simple workflows.