Let's say the partition has 4 replicas (1 leader, 3 followers) and all are currently in sync. min.insync.replicas is set to 3 and request.required.acks is set to all (-1).
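Concretely, I mean a setup like this (a sketch; the topic and host names are just placeholders, and min.insync.replicas is a broker/topic setting while request.required.acks is the old producer-side name for acks):
$ ./bin/kafka-configs.sh --zookeeper zookeeper-1 --alter --entity-type topics --entity-name topic1 --add-config min.insync.replicas=3
$ ./bin/kafka-console-producer.sh --broker-list kafka-1:9092 --topic topic1 --request-required-acks -1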
The producer sends a message to the leader, and the leader appends it to its log. After that, two of the replicas crash before they can fetch this message. The one remaining replica successfully fetches the message and appends it to its own log.
The leader, after a certain timeout, will send an error (NotEnoughReplicas, I think) to the producer, since the min.insync.replicas condition is not met.
My question is: what happens to the message that was appended to the logs of the leader and the surviving replica?
Will it be delivered to consumers when the crashed replicas come back online and the broker starts accepting and committing new messages (i.e., the high watermark advances in the log)?
Several variables can cause data loss in Kafka, including offset handling, consumer auto-commit configuration, producer acknowledgements, and replication.
At-Least-Once Delivery in Apache Kafka
At-least-once delivery requires the producer to maintain extra state about message status and to resend failed messages. This means that at-least-once delivery sacrifices some performance in exchange for the guarantee that all messages will be delivered.
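With the console producer used below, that resend behavior is controlled by explicit retry options (a sketch; the retry count and backoff values are arbitrary, and the option names are those of the older tooling shown in this answer):
$ ./bin/kafka-console-producer.sh --broker-list kafka-1:9092 --topic topic1 --request-required-acks -1 --message-send-max-retries 5 --retry-backoff-ms 200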
The Kafka cluster retains all published messages, whether or not they have been consumed, for a configurable period of time. For example, if log retention is set to two days, then for two days after a message is published it is available for consumption; after that it is discarded to free up space.
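For instance, a two-day retention (172800000 ms) can be set per topic with the stock tooling (a sketch reusing the topic1 name from the test below):
$ ./bin/kafka-configs.sh --zookeeper zookeeper-1 --alter --entity-type topics --entity-name topic1 --add-config retention.ms=172800000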
Kafka provides "at least once" delivery semantics. This means that a message that is sent may delivered one or more times. What people really want is "exactly once" semantics whereby duplicate messages are not delivered.
If fewer than min.insync.replicas replicas are in sync and the producer uses acks=all, then the message is not committed and consumers will not receive it, even after the crashed replicas come back and rejoin the ISR list. You can test this in the following way.
Start two brokers with min.insync.replicas = 2
$ ./bin/kafka-server-start.sh ./config/server-1.properties
$ ./bin/kafka-server-start.sh ./config/server-2.properties
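If you'd rather not edit the two properties files, the same setting can, as far as I know, be passed as a command-line override to the stock start script:
$ ./bin/kafka-server-start.sh ./config/server-1.properties --override min.insync.replicas=2
$ ./bin/kafka-server-start.sh ./config/server-2.properties --override min.insync.replicas=2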
Create a topic with 1 partition and RF=2. Make sure both brokers are in the ISR list.
$ ./bin/kafka-topics.sh --zookeeper zookeeper-1 --create --topic topic1 --partitions 1 --replication-factor 2
Created topic "topic1".
$ ./bin/kafka-topics.sh --zookeeper zookeeper-1 --describe --topic topic1
Topic:topic1 PartitionCount:1 ReplicationFactor:2 Configs:
Topic: topic1 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
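Alternatively, min.insync.replicas can be set per topic at creation time instead of broker-wide (a variant of the same create command):
$ ./bin/kafka-topics.sh --zookeeper zookeeper-1 --create --topic topic1 --partitions 1 --replication-factor 2 --config min.insync.replicas=2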
Run a console consumer and a console producer. Make sure the producer uses acks=-1:
$ ./bin/kafka-console-consumer.sh --new-consumer --bootstrap-server kafka-1:9092,kafka-2:9092 --topic topic1
$ ./bin/kafka-console-producer.sh --broker-list kafka-1:9092,kafka-2:9092 --topic topic1 --request-required-acks -1
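(On newer Kafka versions the --new-consumer flag has been removed and the consumer command is simply the following; the producer command is unchanged.)
$ ./bin/kafka-console-consumer.sh --bootstrap-server kafka-1:9092,kafka-2:9092 --topic topic1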
Produce some messages. The consumer should receive them.
Kill one of the brokers (I killed the broker with id=2). Check that the ISR list has shrunk to one broker:
$ ./bin/kafka-topics.sh --zookeeper zookeeper-1 --describe --topic topic1
Topic:topic1 PartitionCount:1 ReplicationFactor:2 Configs:
Topic: topic1 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1
Try to produce again. In the producer you should get several
Error: NOT_ENOUGH_REPLICAS
errors (one per retry) and finally
Messages are rejected since there are fewer in-sync replicas than required.
The consumer will not receive these messages.
Restart the killed broker and try to produce again. The consumer will receive the new messages, but not the ones you sent while one of the replicas was down.
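Before producing again, you can confirm the restarted replica has rejoined the ISR (same describe command as above; expected output sketched):
$ ./bin/kafka-topics.sh --zookeeper zookeeper-1 --describe --topic topic1
Topic:topic1 PartitionCount:1 ReplicationFactor:2 Configs:
Topic: topic1 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2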