The Kafka controller in a Kafka cluster is in charge of managing partition leaders and replication.
If there are 100 brokers in a Kafka cluster, is the controller just one Kafka broker? So out of the 100 brokers, is the controller the leader?
How would you know which broker is the controller?
Is the management of the Kafka Controller critical to Kafka system management?
Active controller

In a Kafka cluster, one of the brokers serves as the controller, which is responsible for managing the states of partitions and replicas and for performing administrative tasks like reassigning partitions. At any given time there is only one controller broker in your cluster.
A Kafka cluster consists of one or more servers (Kafka brokers) running Kafka. Producers are processes that push records into Kafka topics within the broker. A consumer pulls records off a Kafka topic.
Even a lightly used Kafka cluster deployed for production purposes requires three to six brokers and three to five ZooKeeper nodes. The components should be spread across multiple availability zones for redundancy.
A Kafka broker can be elected as the controller in a process known as Kafka Controller Election. The Kafka Controller Election process relies heavily on the features of Apache ZooKeeper, which acts as the source of truth and guarantees that only one broker can ever be elected at a time (due to how ephemeral nodes work).
The controller is one of the Kafka brokers that is also responsible for the task of electing partition leaders (in addition to the usual broker functionality).
Is the controller just one broker?
There is only 1 controller at a time.
Internally, each broker tries to create an ephemeral node in ZooKeeper (/controller). The first one to succeed becomes the controller. The others get a "node already exists" exception and set a watch on the controller node. When the controller dies, the ephemeral node is removed and the watching brokers are notified. Again, the first broker that succeeds in registering the ephemeral node becomes the new controller, while the others once again get the "node already exists" exception and keep waiting.
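For intuition, here is a minimal, self-contained sketch of that "first writer wins" pattern using the plain ZooKeeper Java client. The connection string, session timeout and broker id are placeholder values; this is not Kafka's actual controller code, just an illustration of the ephemeral-node mechanism described above.

import org.apache.zookeeper.*;

// Minimal sketch of the "first writer wins" election described above.
// Connection string, session timeout and broker id are illustrative placeholders.
public class ControllerElectionSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 6000, event -> {
            // Fired when the watched /controller znode disappears: time to re-run the election.
            if (event.getType() == Watcher.Event.EventType.NodeDeleted
                    && "/controller".equals(event.getPath())) {
                System.out.println("Controller is gone, a new election would start here");
            }
        });
        try {
            // EPHEMERAL: the znode is removed automatically when this session dies.
            zk.create("/controller", "{\"brokerid\":1}".getBytes(),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            System.out.println("Won the election, acting as controller");
        } catch (KeeperException.NodeExistsException e) {
            // Lost the race: watch /controller and wait for the next election.
            zk.exists("/controller", true);
            System.out.println("Another broker is controller, watching /controller");
        }
        Thread.sleep(Long.MAX_VALUE); // keep the session (and the ephemeral node) alive
    }
}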
How would you know who is the controller in Kafka?
When a new controller is elected, it receives a "controller epoch" number from ZooKeeper. The brokers know the current controller epoch, and if they receive a message from a controller with an older epoch, they know to ignore it.
Is the controller the leader?
Not really. Each partition has its own leader. When a broker dies, the controller goes over all the partitions that need a new leader, determines who the new leader should be (a replica from that partition's in-sync replica list, aka ISR), and sends a request to all the brokers that host either the new leaders or the existing followers of those partitions.
The new leaders now know that they need to start serving producer and consumer requests from clients, while the followers now know that they need to start replicating from the new leader.
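As a rough illustration of that step, the sketch below picks a new leader for a partition from its ISR. The names and the exact selection rule here are simplified assumptions, not Kafka's actual implementation.

import java.util.*;

// Simplified sketch of choosing a new partition leader, as described above.
// Names and the exact selection rule are illustrative, not Kafka's internals.
public class LeaderSelectionSketch {

    // Pick the first assigned replica that is in the ISR and whose broker is still alive.
    static Optional<Integer> chooseLeader(List<Integer> assignedReplicas,
                                          Set<Integer> isr,
                                          Set<Integer> liveBrokers) {
        return assignedReplicas.stream()
                .filter(isr::contains)
                .filter(liveBrokers::contains)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Integer> assigned = List.of(1, 2, 3);  // replicas assigned to the partition
        Set<Integer> isr = Set.of(2, 3);            // in-sync replicas
        Set<Integer> live = Set.of(2, 3, 4);        // brokers currently alive (broker 1 just died)
        System.out.println("New leader: " + chooseLeader(assigned, isr, live)); // Optional[2]
    }
}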
Within a Kafka cluster, a single broker serves as the active controller, which is responsible for the state management of partitions and replicas. So in your case, if you have a cluster with 100 brokers, one of them will act as the controller.
More details regarding the responsibilities of a cluster controller can be found in the official Kafka documentation.
To find which broker is the controller of a cluster, you first need to connect to ZooKeeper through the ZooKeeper CLI:
./bin/zkCli.sh -server localhost:2181
and then get the /controller znode:
[zk: localhost:2181(CONNECTED) 0] get /controller
The output should look like the one below:
{"version":1,"brokerid":100,"timestamp":"1506423376977"}
cZxid = 0x191
ctime = Tue Sep 26 12:56:16 CEST 2017
mZxid = 0x191
mtime = Tue Sep 26 12:56:16 CEST 2017
pZxid = 0x191
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x15ebdd241840002
dataLength = 56
numChildren = 0
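If you prefer not to query ZooKeeper directly, the controller can also be looked up programmatically through the Kafka AdminClient's describeCluster() call. The bootstrap address below is a placeholder for your own cluster.

import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.Node;

// Looks up the current controller via the Kafka AdminClient instead of ZooKeeper.
public class FindController {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            Node controller = admin.describeCluster().controller().get();
            System.out.println("Controller: broker " + controller.id()
                    + " at " + controller.host() + ":" + controller.port());
        }
    }
}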
ZooKeeper stores the state of a Kafka cluster. It is used for controller election, both when the cluster first starts and when the current controller crashes. The controller is also responsible for telling other replicas to become partition leaders when the broker hosting a partition leader fails or crashes.
The Kafka controller is the brain of the Kafka cluster. It monitors the liveness of the brokers and acts on broker failures.
There is only one Kafka controller in the cluster. The controller is one of the Kafka brokers in the cluster and, in addition to the usual broker functionality, is responsible for electing partition leaders whenever existing brokers leave the cluster or new brokers join it.
The first broker that starts in the cluster becomes the Kafka Controller by creating an ephemeral node called "/controller" in ZooKeeper. When the other brokers start, they also try to create this node in ZooKeeper, but receive a "node already exists" exception, from which they understand that a Controller has already been elected in the cluster.
When ZooKeeper stops receiving heartbeat messages from the Controller, the ephemeral node in ZooKeeper gets deleted. ZooKeeper then notifies all the other brokers in the cluster, via their watchers, that the Controller is gone, which triggers a new Controller election. All the other brokers again try to create the ephemeral node "/controller", and the first one to succeed is elected as the new Controller.
It is possible to end up with more than one broker believing it is the Controller. Consider a case where a long GC (garbage collection) pause occurs on the current Kafka Controller ("Controller_1"), so that ZooKeeper does not receive a heartbeat message from it within the configured session timeout. This causes the "/controller" node to be deleted from ZooKeeper, and another broker from the cluster gets elected as the new Controller ("Controller_2").
In this situation, there are two Controllers, "Controller_1" and "Controller_2", in the cluster. Once its GC pause is over, "Controller_1" may attempt to write or update state in ZooKeeper, while "Controller_2" is also writing and updating state there, which can leave the Kafka cluster inconsistent, with writes coming from both the old Controller and the new Controller.
To avoid this, a new "epoch" is generated every time a Controller election takes place: each time a controller is elected, it receives a new, higher epoch through a ZooKeeper conditional increment operation.
With this in place, when an old Controller ("Controller_1") attempts to update something, ZooKeeper compares the current epoch with the older epoch sent in the old Controller's write/update request and simply rejects it. All the other brokers in the cluster also know the current controller epoch, and if they receive a message from an old controller with an older epoch, they ignore it as well.
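The sketch below shows the broker-side half of that fencing logic in simplified form; the field and method names are illustrative assumptions, not Kafka's actual code.

// Simplified sketch of epoch fencing on the broker side, as described above.
// Names are illustrative, not Kafka's internals.
public class EpochFencingSketch {
    private int currentControllerEpoch = 0;

    // Called when a broker receives a request that claims to come from the controller.
    boolean acceptControllerRequest(int requestEpoch) {
        if (requestEpoch < currentControllerEpoch) {
            return false; // stale controller (e.g. just back from a long GC pause): ignore
        }
        currentControllerEpoch = requestEpoch;
        return true;
    }

    public static void main(String[] args) {
        EpochFencingSketch broker = new EpochFencingSketch();
        System.out.println(broker.acceptControllerRequest(2)); // true  -> current controller
        System.out.println(broker.acceptControllerRequest(1)); // false -> zombie controller
    }
}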