Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why Kafka is not P in CAP theorem

Tags:

apache-kafka

The main developer of Kafka said Kafka is CA but P in CAP theorem. But I'm so confused, is Kafka not Partition tolerate? I think it does, when one replication is down the other would become leader and continue work!

Also, I would like to know what if Kafka uses P? Would P hurt C or A?

like image 568
Jack Avatar asked Jul 17 '18 07:07

Jack


People also ask

What does P in CAP theorem mean?

In CAP theorem, C stands for Consistency, A stands for Availability and P stands for Partition tolerance. Consistency: Every read receives the most recent writes or an error.

Is Kafka consistent or available?

The good news is that Apache Kafka is able to achieve both consistency and completeness for distributed processing of streaming data (these two objectives are part of the larger goal of correctness for database systems).

Is not the property in CAP theorem?

The CAP theorem states that it is not possible to guarantee all three of the desirable properties – consistency, availability, and partition tolerance at the same time in a distributed system with data replication.


2 Answers

CAP Theorem states that any distributed system can provide at most two out of the three guarantees: Consistency, Availability and Partition tolerance.

According to the Engineers at LinkedIn (where Kafka was initially founded) Kafka is a CA system:

All distributed systems must make trade-offs between guaranteeing consistency, availability, and partition tolerance (CAP Theorem). Our goal was to support replication in a Kafka cluster within a single datacenter, where network partitioning is rare, so our design focuses on maintaining highly available and strongly consistent replicas. Strong consistency means that all replicas are byte-to-byte identical, which simplifies the job of an application developer.

However, I would say that it depends on your configuration and more precisely on the variables acks, min.insync.replicas and replication.factor. According to the docs,

If a topic is configured with only two replicas and one fails (i.e., only one in sync replica remains), then writes that specify acks=all will succeed. However, these writes could be lost if the remaining replica also fails. Although this ensures maximum availability of the partition, this behavior may be undesirable to some users who prefer durability over availability. Therefore, we provide two topic-level configurations that can be used to prefer message durability over availability:

  1. Disable unclean leader election - if all replicas become unavailable, then the partition will remain unavailable until the most recent leader becomes available again. This effectively prefers unavailability over the risk of message loss. See the previous section on Unclean Leader Election for clarification.

  2. Specify a minimum ISR size - the partition will only accept writes if the size of the ISR is above a certain minimum, in order to prevent the loss of messages that were written to just a single replica, which subsequently becomes unavailable. This setting only takes effect if the producer uses acks=all and guarantees that the message will be acknowledged by at least this many in-sync replicas. This setting offers a trade-off between consistency and availability. A higher setting for minimum ISR size guarantees better consistency since the message is guaranteed to be written to more replicas which reduces the probability that it will be lost. However, it reduces availability since the partition will be unavailable for writes if the number of in-sync replicas drops below the minimum threshold.

like image 195
Giorgos Myrianthous Avatar answered Oct 25 '22 09:10

Giorgos Myrianthous


If you read how CAP defines C, A and P, "CA but not P" just means that when an arbitrary network partition happens, each Kafka topic-partition will either stop serving requests (lose A), or lose some data (lose C), or both, depending on its settings and partition's specifics.

If a network partition splits all ISRs from Zookeeper, with default configuration unclean.leader.election.enable = false, no replicas can be elected as a leader (lose A).

If at least one ISR can connect, it will be elected, so it can still serve requests (preserve A). But with default min.insync.replicas = 1 an ISR can lag behind the leader by approximately replica.lag.time.max.ms = 10000. So by electing it Kafka potentially throws away writes confirmed to producers by the ex-leader (lose C).

Kafka can preserve both A and C for some limited partitions. E.g. you have min.insync.replicas = 2 and replication.factor = 3, and all 3 replicas are in-sync when a network partition happens, and it splits off at most 1 ISR (either a single-node failures, or a single-DC failure or a single cross-DC link failure).

To preserve C for arbitrary partitions, you have to set min.insync.replicas = replication.factor. This way, no matter which ISR is elected, it is guaranteed to have the latest data. But at the same time it won't be able to serve write requests until the partition heals (lose A).

like image 40
Alexander Abramov Avatar answered Oct 25 '22 09:10

Alexander Abramov