
In which directory does Apache Kafka store data on broker nodes?

Tags:

apache-kafka

I can see a property in config/server.properties called log.dir. Does this mean Kafka uses the same directory for storing both logs and data?

asked Nov 01 '16 21:11 by Midhun Mathew Sunny


People also ask

How is data stored in Apache Kafka?

Kafka stores all messages with the same key in a single partition. Each new message in the partition gets an ID that is one more than the previous ID. This ID is called the offset. So the first message is at offset 0, the second message is at offset 1, and so on.
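As a rough illustration of the key-to-partition and offset behaviour described above, here is a minimal in-memory sketch. It is not Kafka code: `PartitionLog`, `choose_partition` and `send` are hypothetical names, and CRC32 is only a stand-in for the hash that Kafka's default partitioner actually uses (murmur2).

```python
import zlib


class PartitionLog:
    """Append-only list of records; a record's offset is its position in the list."""

    def __init__(self):
        self.records = []

    def append(self, value):
        offset = len(self.records)  # next offset is one more than the last one used
        self.records.append(value)
        return offset


def choose_partition(key: bytes, num_partitions: int) -> int:
    # Stand-in hash (CRC32). Kafka's default partitioner uses murmur2 instead,
    # but the idea is the same: equal keys always map to the same partition.
    return zlib.crc32(key) % num_partitions


# Three partitions for one hypothetical topic.
partitions = [PartitionLog() for _ in range(3)]


def send(key: bytes, value: str):
    """Append value to the partition chosen by key; return (partition, offset)."""
    p = choose_partition(key, len(partitions))
    return p, partitions[p].append(value)


first = send(b"user-1", "login")
second = send(b"user-1", "logout")
print(first, second)
```

Both calls use the same key, so both records land in the same partition, with offsets 0 and 1.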

Does Kafka store data in memory or disk?

Kafka always writes to the filesystem, but the I/O operations are actually carried out by the operating system. On Linux, the data is written to the page cache first and stays there until the OS flushes it to disk.

Where are the Kafka topics stored?

By default on Linux, topic data is stored in /tmp/kafka-logs. If you navigate to this folder you will see files such as recovery-point-offset-checkpoint and replication-offset-checkpoint.
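Besides the checkpoint files, the data directory holds one subdirectory per topic partition, named `<topic>-<partition>`, with segment files such as `00000000000000000000.log` inside. The sketch below builds a throwaway directory that mimics this layout and lists the partitions it finds. The topic names and the `list_partitions` helper are made up for illustration; to inspect a real broker you would point it at the actual data directory.

```python
import os
import tempfile


def list_partitions(log_dir):
    """Return sorted (topic, partition) pairs for the subdirectories of log_dir."""
    found = []
    for name in os.listdir(log_dir):
        if not os.path.isdir(os.path.join(log_dir, name)):
            continue  # skip plain files such as the checkpoint files
        # Split on the LAST dash, because topic names may contain dashes.
        topic, sep, partition = name.rpartition("-")
        if sep and partition.isdigit():
            found.append((topic, int(partition)))
    return sorted(found)


# Build a temporary directory that imitates a broker data directory.
demo_dir = tempfile.mkdtemp()
for d in ("orders-0", "orders-1", "user-events-0"):  # hypothetical topics
    os.makedirs(os.path.join(demo_dir, d))
    open(os.path.join(demo_dir, d, "00000000000000000000.log"), "w").close()
for f in ("recovery-point-offset-checkpoint", "replication-offset-checkpoint"):
    open(os.path.join(demo_dir, f), "w").close()

layout = list_partitions(demo_dir)
print(layout)
```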

What are broker nodes in Kafka?

A Kafka broker receives messages from producers and stores them on disk, keyed by a unique offset. A Kafka broker allows consumers to fetch messages by topic, partition and offset. Kafka brokers can form a Kafka cluster by sharing information with each other, directly or indirectly, using ZooKeeper.

What is a broker in Apache Kafka?

A broker is a server (node) in Apache Kafka, so a Kafka cluster is composed of multiple brokers. Each broker in the cluster is identified by a unique integer ID and holds some of the partitions of a topic. When we specify the number of partitions at topic creation time, the partitions are spread across the brokers available in the cluster.

What is Apache Kafka and how does it work?

Apache Kafka is more than just a message broker. Its durable, replayable log gives it some database-like capabilities, and some businesses now use it as the definitive record of their events.

What is the difference between broker and partition in Kafka?

Data is stored in a partition for a limited time and is immutable. A broker is a server (node) in Apache Kafka, so a Kafka cluster is composed of multiple brokers. Each broker in the cluster is identified by a unique integer ID and holds some of the partitions of a topic.

How do I get Kafka logs from the broker?

log.dir in server.properties is the place where the Kafka broker stores the commit logs containing your data. Typically this will be your high-speed mounted disk for mission-critical use cases. For application/broker logging, you can use the general log4j configuration to send the event logs to a custom location.


1 Answer

Kafka topics are "distributed and partitioned append-only logs". The parameter log.dir defines where topics (i.e., data) are stored.

It is not related to application/broker logging.

The default log.dir is /tmp/kafka-logs, which you may want to change if your OS has a /tmp directory cleaner, because the cleaner would delete your topic data.
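To check which data directory a given server.properties resolves to, something like the following sketch may help. It assumes the usual broker behaviour: log.dirs (a comma-separated list) takes precedence when set, otherwise log.dir is used, otherwise the default /tmp/kafka-logs applies. The parser is deliberately simplified (plain `key=value` lines only), and the sample path /var/lib/kafka/data is just an example, not a recommendation from the answer.

```python
def parse_properties(text):
    """Very small key=value parser; ignores blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def data_dirs(props):
    """Resolve the topic data directories: log.dirs, then log.dir, then the default."""
    raw = props.get("log.dirs") or props.get("log.dir") or "/tmp/kafka-logs"
    return [d.strip() for d in raw.split(",") if d.strip()]


def dirs_under_tmp(dirs):
    """Return the directories that a /tmp cleaner could wipe."""
    return [d for d in dirs if d == "/tmp" or d.startswith("/tmp/")]


# Hypothetical config that moves the data out of /tmp.
sample = """
# config/server.properties (excerpt)
broker.id=0
log.dirs=/var/lib/kafka/data
"""

configured = data_dirs(parse_properties(sample))
default = data_dirs({})  # nothing set, so the default applies
print(configured, dirs_under_tmp(configured))
print(default, dirs_under_tmp(default))
```

With the sample config no directory is flagged, while the untouched default is reported as living under /tmp.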

answered Oct 21 '22 07:10 by Matthias J. Sax