I would like to ask if my understanding of Kafka is correct.
For really, really big data streams, a conventional database is not adequate, so people use things such as Hadoop or Storm. Does Kafka sit on top of those systems and provide ...directions for where the real-time data should go?
Kafka is primarily used to build real-time streaming data pipelines and applications that adapt to the data streams. It combines messaging, storage, and stream processing to allow storage and analysis of both historical and real-time data.
By dividing a topic into partitions, Kafka lets consuming applications read data in parallel. There's a catch, though: within a consumer group, Kafka assigns each partition to at most one consumer (but one consumer can be assigned many partitions).
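A minimal sketch of a consumer joining a group, to make the parallelism concrete; the broker address localhost:9092, group id analytics, and topic events are placeholder assumptions. If you start a second copy with the same group.id, Kafka rebalances and splits the topic's partitions between the two instances.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class GroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "analytics");               // consumers sharing this id split the partitions
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events")); // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // partition() shows which of this instance's assigned partitions the record came from
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```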
The purpose of an API is essentially to provide a way for different services, components, and microservices to communicate. REST is one of the most popular API architectures out there, but when you need to build an event streaming platform, you use the Kafka API.
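For instance, publishing an event through the Kafka producer API looks roughly like this; the broker address and topic name are again placeholder assumptions:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Records with the same key always land in the same partition,
            // which preserves ordering per key
            producer.send(new ProducerRecord<>("events", "user-42", "page_view"));
        }
    }
}
```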
The Kafka Streams API, in turn, consumes data streams from one or more Kafka topics and processes or transforms them as needed. After processing, the resulting stream is published to another Kafka topic for downstream use, or written back to transform an existing topic.
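A minimal Kafka Streams sketch of that consume-transform-publish loop; the application id, broker address, and topic names are placeholder assumptions:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class UppercaseApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");     // placeholder app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Read from one topic, transform each value, publish to another topic
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("events");
        source.mapValues(value -> value.toUpperCase()).to("events-uppercased");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```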
I don't think so.
Kafka is a messaging system, and it does not sit on top of a database.
You can compare Kafka with messaging systems like ActiveMQ, RabbitMQ, etc.
From the Apache Kafka documentation:
Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
Key takeaways:
Communication between the clients and the servers is done with a simple, high-performance, language-agnostic TCP protocol.
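Any client, in any language, talks to the brokers over that same binary TCP protocol. As one sketch of such a client-server exchange, here is the Java AdminClient listing topics (broker address is a placeholder assumption):

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

import java.util.Properties;

public class ListTopics {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        try (AdminClient admin = AdminClient.create(props)) {
            // The metadata request/response travels over Kafka's TCP wire protocol
            admin.listTopics().names().get().forEach(System.out::println);
        }
    }
}
```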
Use Cases:
Messaging, website activity tracking, metrics, log aggregation, stream processing, event sourcing, and acting as an external commit log for distributed systems.