
Maximum value for zookeeper.connection.timeout.ms

Right now we are running Kafka on AWS EC2 servers, and ZooKeeper is running on separate EC2 instances.

We have created services (systemd units) for Kafka and ZooKeeper to make sure that they are started if a server gets rebooted.

The problem is that the ZooKeeper servers are sometimes a little late in starting, and by that time the Kafka brokers have already terminated.

To deal with this issue, we are planning to increase zookeeper.connection.timeout.ms on the broker side to a high value, such as 10 minutes. Is this a good approach?

Are there any side effects of increasing zookeeper.connection.timeout.ms?

Prabhakar D asked Sep 15 '25 05:09


1 Answer

Increasing zookeeper.connection.timeout.ms may or may not solve the problem at hand, and it may also mean that a broker soft failure takes longer to detect.

A couple of things you can do:

1) Alter the systemd unit so that the Kafka launch is delayed by 10 minutes (the time you wanted to put in the ZooKeeper timeout).
2) We are using an HDP cluster, which automatically takes care of such scenarios.
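The first suggestion, holding back the Kafka launch until ZooKeeper has had time to start, could be sketched roughly as below. This is a variation on the fixed delay rather than the exact approach described above: instead of always sleeping for 10 minutes, it polls the ZooKeeper client port and gives up after 10 minutes. The function name `wait_for_port` and its parameters are hypothetical, and an open TCP port only suggests that ZooKeeper is up; it does not prove the ensemble is ready to serve requests.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 600.0,
                  interval_s: float = 5.0, connect_timeout_s: float = 2.0) -> bool:
    """Poll host:port until a TCP connection succeeds.

    Returns True as soon as a connection is accepted, or False if
    timeout_s seconds pass without a successful connection.
    All names and defaults here are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError if the port is closed
            # or the host is unreachable.
            with socket.create_connection((host, port), timeout=connect_timeout_s):
                return True
        except OSError:
            pass
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Sleep between attempts, but never past the deadline.
        time.sleep(min(interval_s, remaining))
```

A small wrapper script could call `wait_for_port` with the ZooKeeper host and client port (2181 by default) and exit non-zero when it returns False; the Kafka systemd unit could then run that script from an `ExecStartPre=` line so the broker only starts once the port is reachable.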

Here is an explanation from the Kafka FAQ: During a broker soft failure, e.g., a long GC, its session on ZooKeeper may time out and hence be treated as failed. Upon detecting this situation, Kafka will migrate all the partition leaderships it currently hosts to other replicas. And once the broker resumes from the soft failure, it can only act as the follower replica of the partitions it originally led.

To move the leadership back to the brokers, one can use the preferred-leader-election tool. Also, in 0.8.2 a new feature will be added that periodically triggers this functionality.

To reduce Zookeeper session expiration, either tune the GC or increase zookeeper.session.timeout.ms in the broker config.

https://cwiki.apache.org/confluence/display/KAFKA/FAQ

Hope this helps

Amar answered Sep 17 '25 18:09