The Kafka documentation says:
- Kafka relies heavily on the filesystem for storing and caching messages.
- A modern operating system provides read-ahead and write-behind techniques that prefetch data in large block multiples and group smaller logical writes into large physical writes.
- Modern operating systems have become increasingly aggressive in their use of main memory for disk caching. A modern OS will happily divert all free memory to disk caching with little performance penalty when the memory is reclaimed. All disk reads and writes will go through this unified cache
- ...rather than maintain as much as possible in-memory and flush it all out to the filesystem in a panic when we run out of space, we invert that. All data is immediately written to a persistent log on the filesystem without necessarily flushing to disk. In effect this just means that it is transferred into the kernel's pagecache.
Further, this article says:
(3) a message is ‘committed’ when all in sync replicas have applied it to their log, and (4) any committed message will not be lost, as long as at least one in sync replica is alive.
So even if I configure the producer with acks=all (which causes the producer to receive an acknowledgement only after all in-sync replicas have committed the message), and the producer receives an acknowledgement for a certain message, is there still a possibility that the message can get lost, especially if all brokers go down and the OS never flushes the committed messages from the page cache to disk?
There are several variables that can cause data loss in Kafka, including consumer offsets, the consumer auto-commit configuration, producer acknowledgements, replication, and more.
Once the consumer processes a message, it commits the new offset value. Originally offsets were maintained in ZooKeeper; modern Kafka stores them in the internal __consumer_offsets topic. Because offsets are persisted, the consumer can resume reading from the correct position even across server outages. This flow repeats until the consumer stops making requests.
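As an illustrative sketch (values are examples, not recommendations), offset commit behaviour is controlled on the consumer side by configuration such as:

```properties
# Commit offsets automatically on a fixed interval (the default behaviour)
enable.auto.commit=true
auto.commit.interval.ms=5000

# Alternatively, disable auto-commit and commit manually after processing,
# to avoid losing or re-reading messages around a crash:
# enable.auto.commit=false
```

With auto-commit enabled, an offset can be committed before the message is actually processed, which is one of the data-loss variables mentioned above.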
If no key is provided, Kafka partitions the data in a round-robin fashion. When a key is present, it is used to select the partition: the partitioner hashes the key and returns the partition number (an int). Without a key there is no stable key-to-partition mapping, so any routing logic would have to rely on the message value, which is usually much more complex to process.
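The key-to-partition mapping can be sketched as below. Note this is a simplified illustration: Kafka's default partitioner actually applies murmur2 to the key bytes, while this sketch uses CRC32 purely to stay dependency-free, and the function name is hypothetical.

```python
import zlib

def choose_partition(key, num_partitions):
    """Map a message key to a partition number (an int)."""
    if key is None:
        # Keyless messages: the real producer falls back to round-robin
        # (or sticky batching in newer client versions).
        raise ValueError("no key: producer assigns partitions round-robin")
    # Hash the key bytes and reduce modulo the partition count.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# The same key always lands on the same partition:
assert choose_partition("user-42", 3) == choose_partition("user-42", 3)
```

This determinism is what makes per-key ordering possible: all messages with the same key go to the same partition.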
Ordering Guarantee with Apache Kafka: "Apache Kafka preserves the order of messages within a partition. This means that if messages were sent from the producer in a specific order, the broker will write them to a partition in that order and all consumers will read them in that order."
With acks=all and a topic replication factor greater than 1, it is still possible to lose acknowledged messages, but it is very unlikely.
For example, if you have 3 replicas (all in-sync) with acks=all, you would need to lose all 3 brokers at the same time, before any of them had time to do the actual write to disk. With acks=all, the acknowledgement is sent once all in-sync replicas have received the message; you can ensure this number stays high with, for example, min.insync.replicas=2.
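As a sketch, the settings discussed above map to configuration like the following (the values shown are illustrative, not recommendations):

```properties
# Producer side: wait for all in-sync replicas before acknowledging
acks=all

# Topic/broker side: require an acknowledged write to exist on at
# least 2 in-sync replicas, or fail the produce request
min.insync.replicas=2

# Optional topic-level settings to force fsync instead of relying
# solely on the OS page cache (at a significant throughput cost)
flush.messages=10000
flush.ms=1000
```

Kafka's default is to leave flushing to the operating system; forcing flushes narrows the window in which an acknowledged message exists only in the page cache, trading away throughput for durability.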
You can reduce the possibility of this scenario even further if you use the rack awareness feature (assuming, of course, that the brokers are physically in different racks, or even better, different data centers).
To summarize, using all these options together, you can reduce the likelihood of losing data enough that it is unlikely to ever happen.