Why Kafka is not P in CAP theorem

Tags:

apache-kafka

The main developer of Kafka said Kafka is CA but P in CAP theorem. But I'm so confused, is Kafka not Partition tolerate? I think it does, when one replication is down the other would become leader and continue work!

Also, I would like to know what if Kafka uses P? Would P hurt C or A?

568

asked Jul 17 '18 07:07

Jack

2 Answers

CAP Theorem states that any distributed system can provide at most two out of the three guarantees: Consistency, Availability and Partition tolerance.

According to the Engineers at LinkedIn (where Kafka was initially founded) Kafka is a CA system:

All distributed systems must make trade-offs between guaranteeing consistency, availability, and partition tolerance (CAP Theorem). Our goal was to support replication in a Kafka cluster within a single datacenter, where network partitioning is rare, so our design focuses on maintaining highly available and strongly consistent replicas. Strong consistency means that all replicas are byte-to-byte identical, which simplifies the job of an application developer.

However, I would say that it depends on your configuration and more precisely on the variables acks, min.insync.replicas and replication.factor. According to the docs,

If a topic is configured with only two replicas and one fails (i.e., only one in sync replica remains), then writes that specify acks=all will succeed. However, these writes could be lost if the remaining replica also fails. Although this ensures maximum availability of the partition, this behavior may be undesirable to some users who prefer durability over availability. Therefore, we provide two topic-level configurations that can be used to prefer message durability over availability:

Disable unclean leader election - if all replicas become unavailable, then the partition will remain unavailable until the most recent leader becomes available again. This effectively prefers unavailability over the risk of message loss. See the previous section on Unclean Leader Election for clarification.

Specify a minimum ISR size - the partition will only accept writes if the size of the ISR is above a certain minimum, in order to prevent the loss of messages that were written to just a single replica, which subsequently becomes unavailable. This setting only takes effect if the producer uses acks=all and guarantees that the message will be acknowledged by at least this many in-sync replicas. This setting offers a trade-off between consistency and availability. A higher setting for minimum ISR size guarantees better consistency since the message is guaranteed to be written to more replicas which reduces the probability that it will be lost. However, it reduces availability since the partition will be unavailable for writes if the number of in-sync replicas drops below the minimum threshold.

195

answered Oct 25 '22 09:10

Giorgos Myrianthous

If you read how CAP defines C, A and P, "CA but not P" just means that when an arbitrary network partition happens, each Kafka topic-partition will either stop serving requests (lose A), or lose some data (lose C), or both, depending on its settings and partition's specifics.

If a network partition splits all ISRs from Zookeeper, with default configuration unclean.leader.election.enable = false, no replicas can be elected as a leader (lose A).

If at least one ISR can connect, it will be elected, so it can still serve requests (preserve A). But with default min.insync.replicas = 1 an ISR can lag behind the leader by approximately replica.lag.time.max.ms = 10000. So by electing it Kafka potentially throws away writes confirmed to producers by the ex-leader (lose C).

Kafka can preserve both A and C for some limited partitions. E.g. you have min.insync.replicas = 2 and replication.factor = 3, and all 3 replicas are in-sync when a network partition happens, and it splits off at most 1 ISR (either a single-node failures, or a single-DC failure or a single cross-DC link failure).

To preserve C for arbitrary partitions, you have to set min.insync.replicas = replication.factor. This way, no matter which ISR is elected, it is guaranteed to have the latest data. But at the same time it won't be able to serve write requests until the partition heals (lose A).

answered Oct 25 '22 09:10

Alexander Abramov

Related questions
                            
                                Smooth vue collapse transition on v-if
                            
                                How does minReadySeconds affect readiness probe?
                            
                                Dotnet Unit test with Coverlet- How to get coverage for entire solution and not just a project
                            
                                A basic Monoid definition gives "No instance for (Semigroup MyMonoid) arising from the superclasses of an instance declaration"
                            
                                Should a component's render method have return type React.ReactNode or JSX.Element?
                            
                                BeginTransaction with IsolationLevel in EF Core
                            
                                TryGetValue pattern with C# 8 nullable reference types
                            
                                Async/Await with Vuex dispatch
                            
                                Gradle duplicate entry error: META-INF/MANIFEST.MF (Or how to delete a file from jar)
                            
                                How to import existing VPC in aws cdk?
                            
                                How can you quickly compute the integer logarithm for any base?
                            
                                Is there a way to include a fragment identifier when using Asp.Net MVC ActionLink, RedirectToAction, etc.?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With