
What does commit-log mean in Kafka?

Tags:

apache-kafka

Forgive me, I am just learning Kafka. I have come across the term commit-log many times while reading Kafka material, but I still have no idea what exactly it is. Some of the places where it is mentioned are linked below.

https://kafka.apache.org/documentation/#uses_commitlog

Kafka can serve as a kind of external commit-log for a distributed system. The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data.

https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying

One of the most useful things I learned in all this was that many of the things we were building had a very simple concept at their heart: the log. Sometimes called write-ahead logs or commit logs or transaction logs,

https://kafka.apache.org/protocol.html#protocol_partitioning

Kafka is a partitioned system so not all servers have the complete data set. Instead recall that topics are split into a pre-defined number of partitions, P, and each partition is replicated with some replication factor, N. Topic partitions themselves are just ordered "commit logs" numbered 0, 1, ..., P.

What does commit-log mean? Is it any different from the concept of the same name in a DBMS? How should I understand it? Thanks.

Joe.wang asked Jul 17 '17 08:07



1 Answer

Conceptually, there's no difference between the "commit log" that Kafka provides and the commit log, transaction log, or write-ahead log that a DBMS uses: both record the changes made to something so that those changes can be replayed later.

In the case of a DBMS this replay will happen if the DB was not shut down cleanly and is necessary to ensure the DB resumes service in a consistent state. Importantly, in a DB this commit log is an implementation detail of the database and is not a concern of the database clients.
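To make the DBMS side concrete, here is a minimal sketch of the write-ahead idea in Python. It is a toy, not how any real database implements its log: the `TinyStore` class, the JSON-lines file format, and the `demo` helper are all assumptions made up for illustration. Each change is appended to a log file before it is applied in memory, and on startup the log is replayed to get back to a consistent state. Note that callers only ever use `put` and `get`; the log stays an internal detail.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store (hypothetical) that logs each write before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._recover()

    def _recover(self):
        # Replay every logged change, in order, to rebuild the in-memory state.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only then apply it to the in-memory state.
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")
        store = TinyStore(path)
        store.put("balance", 100)
        store.put("balance", 70)
        # Simulate an unclean shutdown: the object goes away with no cleanup step.
        del store
        # A fresh instance replays the log and sees the last written value.
        recovered = TinyStore(path)
        return recovered.get("balance")
```

Calling `demo()` writes two values, throws the store away, and reads the latest value back from a new instance that has only the log file to go on.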

In a Kafka application this commit log is a first-class concept. Subscribers to a topic can reconstruct the state of the application for themselves if they want to (in effect, "replaying the log"). They can also react to particular events in the topic and understand how a particular state was arrived at, neither of which is easy with a traditional DBMS.
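The three things a subscriber can do with the log can be sketched without a broker at all. The Python below does not use Kafka or any Kafka client; `PartitionLog` is a hypothetical in-memory stand-in for a single topic partition, and the account names and amounts are invented. It only illustrates the idea that an ordered, offset-addressed log lets each reader derive its own view.

```python
class PartitionLog:
    """Toy append-only log (hypothetical); each record gets the next offset."""

    def __init__(self):
        self._records = []

    def append(self, key, value):
        self._records.append((key, value))
        return len(self._records) - 1  # offset of the record just written

    def read_from(self, offset):
        # Yield (offset, key, value) for every record at or after `offset`.
        for i in range(offset, len(self._records)):
            key, value = self._records[i]
            yield i, key, value


def rebuild_balances(log):
    # Reconstruct current state by replaying the log from the beginning.
    balances = {}
    for _, account, delta in log.read_from(0):
        balances[account] = balances.get(account, 0) + delta
    return balances


def large_withdrawals(log, threshold):
    # React to particular events: offsets of withdrawals of at least `threshold`.
    return [offset for offset, _, delta in log.read_from(0) if delta <= -threshold]


def history(log, account):
    # Explain how a state was arrived at: every change for one account, in order.
    return [(offset, delta) for offset, key, delta in log.read_from(0) if key == account]


log = PartitionLog()
log.append("alice", 100)
log.append("bob", 50)
log.append("alice", -30)
log.append("bob", -45)

balances = rebuild_balances(log)      # {"alice": 70, "bob": 5}
flagged = large_withdrawals(log, 40)  # [3]
alice_trail = history(log, "alice")   # [(0, 100), (2, -30)]
```

All three functions read the same records independently and none of them modifies the log, which is the property that lets several subscribers each build their own state from one topic.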

Tom Bentley answered Sep 28 '22 02:09