I want to understand the log.flush.interval.messages setting in the Kafka broker. The documentation describes it as:
The number of messages written to a log partition before we force an fsync on the log
Does it mean that once the specified number of messages is reached, they are written to another file on the disk? If so:
At the same time, the Kafka paper (http://notes.stephenholiday.com/Kafka.pdf) says that a message is only exposed to consumers after it is flushed to disk from the segment file.
Does the consumer then always read from disk, since it can't read from the segment file? What is the difference between storing in a segment file and storing on disk?
The first thing I want to warn you about is that the Kafka paper is a little outdated regarding how all of this works, since at that time Kafka did not have replication. I suggest you read (if you haven't already) the Replication section of the Kafka documentation.
As the paper mentions, each arriving message is written to a segment file. But keep in mind that when you write to a file, the data is not transferred to the disk device immediately; it is buffered first. The way to force the write to happen is by calling the fsync system call (see man fsync), and this is where log.flush.interval.messages and log.flush.interval.ms come into play. With these settings you can tell Kafka exactly when to perform this flush (after a certain number of messages or after a period of time). Note, however, that Kafka generally recommends not setting these, relying instead on replication for durability and letting the operating system's background flush capabilities do their work, as this is more efficient (see the broker configs section of the Kafka documentation).
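As a rough illustration, this is how those settings could look in a broker's server.properties; the values are purely examples, not recommendations:

```properties
# server.properties -- illustrative values only.
# Force an fsync after this many messages accumulate on a log partition:
log.flush.interval.messages=10000
# ...or after this much time has passed since the last flush:
log.flush.interval.ms=1000
# By default both are unset, leaving flushing to the OS background writeback.
```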
For the second part of your question: as mentioned in the Replication section of the Kafka documentation, only committed messages (a message is considered "committed" when all in-sync replicas for that partition have applied it to their log) are ever given out to the consumer. This avoids consumers potentially seeing a message that could be lost (because it was not yet fsynced to disk) if the leader fails.
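To sketch what "durability through replication" typically looks like in practice (the exact values here are assumptions for illustration, not prescriptions):

```properties
# Producer config: wait until all in-sync replicas have acknowledged the write.
acks=all

# Topic/broker config: require at least 2 in-sync replicas for a write to
# succeed (this assumes the topic has a replication factor of 3).
min.insync.replicas=2
```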
@user1870400 Both log.flush.interval.ms and log.flush.interval.messages are set to the maximum value by default. This makes Kafka's flushing of the log to disk (e.g. via fsync on Linux) depend only on the file system.
So even if you set acks to 'all', none of the follower replicas (nor the leader itself) guarantees that the log it fetched from the leader has been flushed to disk. And if all the replicas crash before flushing, the log will be lost.
The reason Kafka makes such an 'unsafe' choice is, just as the paper says:
Kafka avoids explicitly caching messages in memory at the Kafka layer and relies on the underlying file system page cache instead. This has the main benefit of avoiding double buffering---messages are only cached in the page cache. It has the additional benefit of retaining a warm cache even when the broker process is restarted.
In order to make better use of the file system cache, Kafka sets both flush intervals to the maximum by default. If you want to avoid losing messages even when N brokers crash, set the topic-level config flush.messages or the broker-level config log.flush.interval.messages to 1, as shown in the sketch below.
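For example (the topic name my-topic is just a placeholder), flushing after every single message could be configured like this:

```properties
# Broker-wide (server.properties): fsync after every message.
log.flush.interval.messages=1

# Per-topic override, e.g. applied with kafka-configs.sh:
#   bin/kafka-configs.sh --bootstrap-server localhost:9092 \
#     --entity-type topics --entity-name my-topic \
#     --alter --add-config flush.messages=1
flush.messages=1
```

Keep in mind the trade-off described above: flushing on every message defeats the page-cache optimization and hurts throughput.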