
Kafka - Stream vs Topic

What is the difference between a Kafka topic and a stream? I thought the two were the same. The ksqlDB documentation below describes creating a stream from a topic, which is what caused my confusion.

https://docs.ksqldb.io/en/latest/developer-guide/create-a-stream/

Questions:

  1. What is the difference between a Kafka topic and a stream?
  2. If a topic already gives us a stream of events, why do we need to create a stream from a topic?
  3. Can we create a table from a topic directly, or do we have to create a stream first and build the table from it?
RamPrakash asked Jun 12 '20 02:06



1 Answer

A topic is a collection of partitions, and each partition holds an ordered sequence of messages. On the broker, a partition is actually a directory on disk.

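
To make the topic/partition relationship concrete, here is a minimal in-memory sketch in plain Java. It is only a toy model, not Kafka code: the class name `PartitionedTopic` and its methods are hypothetical, and `String.hashCode()` is an assumption standing in for the producer's real partitioner.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a topic: a fixed number of partitions, each an append-only list.
// Real Kafka keeps each partition as a directory of log files on the broker.
class PartitionedTopic {
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedTopic(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Assumption: String.hashCode() stands in for the real partitioner.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Appends a message and returns its offset within the chosen partition.
    long append(String key, String value) {
        List<String> log = partitions.get(partitionFor(key));
        log.add(value);
        return log.size() - 1;
    }

    // Returns a copy of the messages in one partition, in append order.
    List<String> read(int partition) {
        return new ArrayList<>(partitions.get(partition));
    }

    public static void main(String[] args) {
        PartitionedTopic topic = new PartitionedTopic(3);
        topic.append("user-1", "login");
        topic.append("user-1", "logout");
        // Messages with the same key land in the same partition, in order.
        System.out.println(topic.read(topic.partitionFor("user-1")));
    }
}
```

The point of the sketch is that ordering is a property of a single partition: messages with the same key keep their relative order, while nothing is promised across partitions.
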
  1. What is the difference between Kafka topic and stream?

A. A stream is a flow of records, whether it comes from a single topic or from a collection of topics. Kafka Streams even has a stream(Collection<String> topics) overload on StreamsBuilder, which shows that a stream is not confined to a single topic.

  2. When a topic gives us the stream of events, what is the need for us to create a stream from a topic?

A. A stream is the basic abstraction in Kafka Streams: records in a stream flow through a set of processors. The term "stream" belongs to the Kafka Streams library rather than to the broker itself. Internally, Kafka Streams creates a consumer that reads the topic(s).

As noted above, a stream can also span several topics. So if you want to consume different topics and process them together, you create one stream over those topics.
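
As a rough plain-Java analogy (this is not the Kafka Streams API), the sketch below lets two in-memory lists stand in for topics and sends their records through one shared chain of processing steps. The class `MultiTopicStream`, the topic names, and the filter/uppercase steps are all hypothetical; note also that `java.util.stream` is finite and ordered here, whereas a real stream over several topics is unbounded and gives no ordering guarantee across topics.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Analogy only: two lists play the role of two topics feeding one stream.
class MultiTopicStream {
    // Merges both "topics" and runs every record through the same processors.
    static List<String> process(List<String> ordersTopic, List<String> refundsTopic) {
        return Stream.concat(ordersTopic.stream(), refundsTopic.stream())
                .filter(value -> !value.isEmpty())   // processor 1: drop empty records
                .map(String::toUpperCase)            // processor 2: transform each record
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> orders = Arrays.asList("order-1", "", "order-2");
        List<String> refunds = Arrays.asList("refund-9");
        System.out.println(process(orders, refunds));
    }
}
```
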

  3. Can we create a table from a topic directly? Or should we create a stream first to create a table?

A. Yes, it is possible to create a table from a topic directly, using either the Kafka client API or the Kafka Streams API.

If you are using Kafka Streams in your application, you can use the StreamsBuilder#table() or StreamsBuilder#globalTable() methods.

If you are using the Kafka client API, you have to consume the topic manually and populate a map (or some other data structure) with the messages.
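
The "populate a map" approach can be sketched without any Kafka dependency. In the hypothetical `TopicTable` class below, a list of key/value pairs stands in for the records a consumer poll loop would deliver; a later value for a key overwrites the earlier one, and a null value is treated as a delete, which mirrors how Kafka Streams tables interpret null-valued records. The names and sample data are assumptions for illustration only.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: fold a topic's records into a table holding the latest value per key.
class TopicTable {
    // Hypothetical helper that builds one key/value record (value may be null).
    static Map.Entry<String, String> record(String key, String value) {
        return new SimpleEntry<String, String>(key, value);
    }

    // Replays records in order: upsert on a non-null value, delete on null.
    static Map<String, String> materialize(List<Map.Entry<String, String>> records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Map.Entry<String, String> rec : records) {
            if (rec.getValue() == null) {
                table.remove(rec.getKey());
            } else {
                table.put(rec.getKey(), rec.getValue());
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> records = Arrays.asList(
                record("alice", "Paris"),
                record("bob", "London"),
                record("alice", "Berlin"),  // newer value replaces "Paris"
                record("bob", null));       // null value deletes "bob"
        System.out.println(materialize(records)); // prints {alice=Berlin}
    }
}
```

This is the essential difference between the two views of the same topic: the stream is every record in order, while the table is only the current value for each key.
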

Kafka Streams is aimed at applications built as topologies, that is, with multiple processing stages. For simple applications that just consume, process, and commit, without multiple processing stages, the Kafka client API should be good enough. Whatever can be achieved with Kafka Streams can also be achieved with the Kafka client API.

Kafka Streams mainly makes complex workflows simpler, but it can also be used for simple workflows.

JavaTechnical answered Oct 03 '22 21:10