
How to configure the time it takes for a Kafka cluster to re-elect partition leaders after stopping and restarting a broker?

I have the following setup:
3 Kafka brokers and a 3-node ZooKeeper ensemble
1 topic with 12 partitions and a replication factor of 3 (each Kafka broker is thus the leader of 4 partitions)

I stop one of the brokers: it is removed from the cluster, and leadership of its partitions moves to the two remaining brokers.

I start the broker again: it reappears in the cluster, and eventually leadership is rebalanced so that each broker is again the leader of 4 partitions.

This works OK, except that the delay before the rebalancing is too long (on the order of minutes). It happens under no load: no messages are sent to the cluster and none are consumed.

Kafka version 0.9.0.0, zookeeper 3.4.6

zookeeper tickTime = 2000

kafka zookeeper.connection.timeout.ms = 6000

(basically the default config)

Does anyone know which config parameters in Kafka and/or ZooKeeper influence the time taken for the leader rebalancing?

Jan Šourek asked Feb 02 '16 17:02



1 Answer

As stated in the official documentation (http://kafka.apache.org/documentation.html#configuration; more details about broker configuration can be found in the Scala class kafka.server.KafkaConfig), there is a leader.imbalance.check.interval.seconds property, which defaults to 300 (5 minutes). Setting it to 30 seconds does what I need.
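
For illustration, the setting could look like the broker configuration fragment below. This is a minimal sketch, not a verified config: the file path is the conventional location in a Kafka distribution and is an assumption, and the two related properties are shown only for context, with the values I understand to be their documented defaults. Since the controller performs the imbalance check and any broker may become the controller, it seems safest to apply the same value on every broker and restart them.

```properties
# config/server.properties (assumed path; set on every broker)

# How often the controller checks for partition leader imbalance.
# Default is 300 seconds; 30 is the value used in this answer.
leader.imbalance.check.interval.seconds=30

# Related settings, shown for context only (assumed to be at their defaults).
# Automatic preferred-leader rebalancing must be enabled for the check to run.
auto.leader.rebalance.enable=true
# A rebalance is triggered only when a broker's imbalance exceeds this percentage.
leader.imbalance.per.broker.percentage=10
```

With a 30-second interval, a restarted broker should regain leadership of its preferred partitions within roughly one check interval after it has rejoined the in-sync replica set, rather than waiting up to 5 minutes.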

Jan Šourek answered Nov 03 '22 01:11