Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Apache ZooKeeper: How do writes work

Tags:

Apache ZooKeeper is a kind of high available data-store for small objects. A ZooKeeper cluster consists of some nodes which all keep the whole dataset in their memory. The dataset is called "always-consistent", so every node has the same data at every time.

According to the documentation and blog posts, every node in the cluster can answer reads and accept writes.

  • Reads are always answered locally by the node, so no communication with the cluster is involved.
  • Writes are forwarded to a designated "Leader" node, which forwards the write-request to all nodes and waits for their replies. If at least half of the nodes answers, the write is considered successful.

Question: Why is it enough for the leader to wait for half of the nodes to reply? If somebody connects to one of the nodes which didn't receive the update, he gets an outdated result (only local read to local value).

like image 968
theomega Avatar asked Mar 24 '11 13:03

theomega


People also ask

How do you write ZooKeeper?

Examples of zookeeper One of the blogs belongs to the gorilla zookeeper.

How does Apache ZooKeeper work?

ZooKeeper is simple. ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchal namespace which is organized similarly to a standard file system. The name space consists of data registers - called znodes, in ZooKeeper parlance - and these are similar to files and directories.

What are Znodes in ZooKeeper?

In the ZooKeeper documentatin, znodes refer to the data nodes. Servers to refer to machines that make up the ZooKeeper service; quorum peers refer to the servers that make up an ensemble; client refers to any host or process which uses a ZooKeeper service. Znodes are the main enitity that a programmer access.


1 Answers

In order to achieve high read-availability, Zookeeper guarantees a weak-consistency over the replicates: a read can always be answered by a client node, and the answer returned may be a stale value (even a new version has been committed through the leader).

Then it is the users' responsibility to decide whether the answer for a read is "stale-able" or not, since not all applications require the up-to-date information. So the following choices are provided:

1) If your application does not need up-to-date values for reads, you can get high read-availability by requesting data directly from the client.

2) If your application requires up-to-date values for reads, you should use the "sync" API before your read request to sync the client-side version with the leader.

So as a conclusion, Zookeeper provides a customizable consistency guarantee, and users can decide the balance between availability and consistency.

If you want to know more about the internals of Zookeeper, I recommend this paper: ZooKeeper: Wait-free coordination for Internet-scale systems. The above strategy is described in Section 4.4.

like image 195
asksw0rder Avatar answered Oct 02 '22 09:10

asksw0rder