
ZooKeeper - what happens if I pass only some of the nodes from the zk cluster (ensemble) in a connection string?

I have a ZooKeeper cluster consisting of N nodes (which know about each other). What if I pass only M < N of the nodes' addresses in the zk client connection string? What will the cluster's behavior be?

In a more specific case, what if I pass the host address of only 1 zk node from the cluster? Is it then possible for the zk client to connect to other hosts from the cluster? What if this one host is down? Will the client be able to connect to other ZooKeeper nodes in the ensemble?

The other question is: is it possible to limit the client to using only specific nodes from the ensemble?

asked Jan 16 '17 at 18:01 by rideronthestorm



1 Answer

What if I pass only M < N of the nodes' addresses in zk client connection string? What will be the cluster's behavior?

ZooKeeper clients will connect only to the M nodes specified in the connection string. The ZooKeeper ensemble's back-end interactions (leader election and processing write transaction proposals) will continue to be processed by all N nodes in the cluster. Any of the N nodes still could become the ensemble leader. If a ZooKeeper server receives a write transaction request, and that server is not the current leader, then it will forward the request to the current leader.
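
To make the "M of N" point concrete, here is a minimal sketch of how a connection string breaks down into the set of servers a client will consider. It is not the actual client code: `parse_connect_string` is a hypothetical helper, the host names are made up, and it assumes the usual `host:port,host:port/chroot` form with 2181 as the default port (IPv6 literals are not handled).

```python
DEFAULT_PORT = 2181  # conventional ZooKeeper client port, assumed as the default


def parse_connect_string(connect_string):
    """Split 'host1:port1,host2:port2/chroot' into ([(host, port), ...], chroot).

    Hypothetical helper for illustration only; real clients do their own parsing.
    """
    chroot = None
    slash = connect_string.find("/")
    if slash >= 0:
        # Everything from the first '/' onward is the optional chroot path.
        chroot = connect_string[slash:]
        if chroot == "/":
            chroot = None
        connect_string = connect_string[:slash]

    servers = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No port given, so fall back to the default client port.
            host, port = entry, DEFAULT_PORT
        servers.append((host, int(port)))
    return servers, chroot


# A 5-node ensemble (N = 5), of which the client is told about only 2 (M = 2).
ensemble = [("zk%d" % i, 2181) for i in range(1, 6)]
candidates, chroot = parse_connect_string("zk1:2181,zk2:2181/app")
unknown_to_client = [s for s in ensemble if s not in candidates]
```

In this sketch `candidates` holds only `zk1` and `zk2`; the other three servers still take part in the ensemble, but the client has no address for them.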

In a more specific case, what if I pass the host address of only 1 zk node from the cluster? Is it then possible for the zk client to connect to other hosts from the cluster? What if this one host is down? Will the client be able to connect to other ZooKeeper nodes in the ensemble?

No, the client would only be able to connect to the single address specified in the connection string. That address effectively becomes a single point of failure for the application, because if the server goes down, the client will not have any other options for establishing a connection.
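
The single-point-of-failure effect can be illustrated with a small model. This is a sketch under simplifying assumptions, not the client's real connection logic: `pick_server` is a hypothetical function, server reachability is modeled as a plain set, and candidates are tried in the order given (real clients may order their attempts differently).

```python
def pick_server(candidates, reachable):
    """Return the first candidate that is currently reachable, or None.

    Hypothetical model: the client can only ever try addresses from its
    own connection string, never other members of the ensemble.
    """
    for server in candidates:
        if server in reachable:
            return server
    return None


ensemble = {"zk1:2181", "zk2:2181", "zk3:2181"}
reachable = ensemble - {"zk1:2181"}  # zk1 is down, zk2 and zk3 are healthy

# Connection string with a single address: nothing left to fall back to.
single = pick_server(["zk1:2181"], reachable)

# Connection string with two addresses: the client can move on to zk2.
pair = pick_server(["zk1:2181", "zk2:2181"], reachable)
```

Here `single` ends up as `None` even though two of the three servers are healthy, while `pair` resolves to `zk2:2181`.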

The other question is: is it possible to limit the client to using only specific nodes from the ensemble?

Yes, you can limit the nodes that the client considers for establishing a connection by listing only those nodes in the client's connection string. However, keep in mind that any of the N nodes in the cluster could still become the leader, and then all client write requests will get forwarded to that leader. In that sense, the client is using the other nodes indirectly, but the client is not establishing a direct socket connection to those nodes.
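
The direct versus indirect use of nodes can be sketched as a toy model of request routing. The function `write_path` and the server names are hypothetical, and the model ignores the proposal and acknowledgement traffic among servers; it only shows which servers a write request passes through.

```python
def write_path(connected_to, leader):
    """Return the servers a write request passes through, in order.

    Toy model: a server that is not the leader forwards the write
    to the current leader.
    """
    if connected_to == leader:
        return [connected_to]
    return [connected_to, leader]


# The client lists only zk1 and zk2, but zk5 happens to be the leader.
client_candidates = ["zk1", "zk2"]
path = write_path("zk1", leader="zk5")

# Servers the request touches that the client never connects to directly.
indirect = [s for s in path if s not in client_candidates]
```

In this model the write still reaches `zk5`, although the client's only socket connection is to `zk1`.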

The ZooKeeper Overview page in the Apache documentation has further discussion of client and server behavior in a ZooKeeper cluster. For example, there is a relevant quote in the Implementation section:

As part of the agreement protocol all write requests from clients are forwarded to a single server, called the leader. The rest of the ZooKeeper servers, called followers, receive message proposals from the leader and agree upon message delivery. The messaging layer takes care of replacing leaders on failures and syncing followers with leaders.

answered Jan 02 '23 at 13:01 by Chris Nauroth
