
Replacing Bad Node in Zookeeper Quorum Safely

We have a 5-node ZooKeeper quorum (A, B, C, D, E) running in production. One node (E) went down last week. The quorum is still healthy, but we need to replace E with a new, healthy node (F).

I am weighing two options:

1. Add F to the quorum and then remove E.
2. Replace E with F in the configuration, restart the followers, and then restart the leader.

I tested option 2, and I can see that F is accepted into the quorum once a leader election is forced (by restarting the leader).

The quorum is healthy, but I want to make sure this is the standard procedure.

I can't find any Apache documentation about node replacement for this version.

ZK version: 3.4.6
DevOps_101 asked Sep 22 '17 04:09




1 Answer

Yes. For versions prior to 3.5.*, reconfiguring a ZK cluster requires coordinated restarts: first update the configuration on every node so that the new node replaces the old one, then restart the servers so that the new node can join the quorum and the old one is dropped. I found this gist helpful.
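As a rough illustration of the configuration update, here is a minimal Python sketch that swaps the host of one `server.N` entry in a `zoo.cfg`-style file. The hostnames A to F come from the question; the ports 2888/3888, the other settings, and the helper name `replace_server_host` are assumptions made up for this example. It also assumes F takes over E's server id (5), in which case F's `myid` file would need to contain 5.

```python
import re

# Hypothetical zoo.cfg contents; the ports and dataDir are illustrative assumptions.
OLD_CFG = """\
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=10
syncLimit=5
server.1=A:2888:3888
server.2=B:2888:3888
server.3=C:2888:3888
server.4=D:2888:3888
server.5=E:2888:3888
"""


def replace_server_host(cfg_text: str, server_id: int, new_host: str) -> str:
    """Return cfg_text with the host of server.<server_id> replaced by new_host."""
    pattern = re.compile(rf"^(server\.{server_id}=)[^:\s]+(:.*)$", re.MULTILINE)
    updated, count = pattern.subn(
        lambda m: m.group(1) + new_host + m.group(2), cfg_text
    )
    if count != 1:
        # Refuse to guess if the entry is missing or duplicated.
        raise ValueError(
            f"expected exactly one server.{server_id} entry, found {count}"
        )
    return updated


# The same updated text would be written to every node's zoo.cfg before restarting.
NEW_CFG = replace_server_host(OLD_CFG, 5, "F")
```

The point of the sketch is only that every server, including F, ends up with the same server list before any restart happens.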

In general, rolling restarts are also the recommended approach for upgrades (see the Apache reference link).

If possible, I suggest you consider upgrading to a 3.5.* version, where dynamic reconfiguration is possible without any restarts.
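To restart followers first and the leader last, as described in the question, you need to know each server's current role. One way to find out, sketched below in Python, is the `srvr` four-letter command, which reports a `Mode:` line. The client port 2181 and the helper names are assumptions for illustration, and the hostnames are the ones from the question.

```python
import socket


def four_letter_word(host, port=2181, cmd=b"srvr", timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output):
    """Extract the value of the 'Mode:' line, or None if it is absent."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def restart_order(modes):
    """Given {host: mode}, return hosts with every non-leader first and the leader last."""
    return sorted(modes, key=lambda host: (modes[host] == "leader", host))
```

A possible use, assuming the hosts are reachable: build `modes = {h: parse_mode(four_letter_word(h)) for h in ["A", "B", "C", "D"]}`, then restart the servers in the order returned by `restart_order(modes)`, waiting for each one to rejoin before moving on to the next.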

Sachin Lala answered Oct 06 '22 11:10