Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

what does 2n + 1 quorum mean?

I've come across this when describing the Zookeeper configuration for HBase, and I'm unfamiliar with the term. Does the 'N' have anything to do with the number of nodes in my HBase cluster? Or the number of nodes I should use in my Zookeeper cluster?

like image 546
Big Bird Avatar asked Nov 19 '10 18:11

Big Bird


2 Answers

2f+1 refers to the level of reliability/availablility you require, in general it is not related to performance.

ZooKeeper ensembles (serving clusters) are made up of one or more servers which "vote" on each change. A majority of the original servers are required to "approve" any change before it's accepted. Clients (hbase in this case) connect to the ensemble and use it to coordinate. If the ensemble is up the clients can do this, if the ensemble is down then hbase is unable to use the service.

Say you have 3 servers (f=1) in the ensemble, if one fails the service is still up (2 is a majority). However if a second server fails the service would be down.

Say you have 5 servers (f=2) in the ensemble. In this case two servers can fail (3 is a majority) and the service is still up.

Typically 3 servers is more than sufficient. However for online production serving environments I'd suggest 5. Why? Say you take 1 server down for scheduled maintenance. If you have 5 servers you can stay up even if one of the remaining active servers fails unexpectedly.

Why not have 101 servers then? -- TANSTAAFL. See the graph here. ZK is a quorum based service. As the number of servers increases the write performance actually drops. More servers are required to participate in the quroum process (voting). As a result the write ops/sec decreases. (read is uneffected though).

like image 161
phunt Avatar answered Nov 03 '22 22:11

phunt


n refers to the number of failures that the system can experience but still be able to function with at least a majority of nodes. Two examples:

n = 1 - one node can fail out of a total of 2n+1 = 3 nodes

n = 2 - two nodes can fail out of a total of 2n+1 = 5 nodes

And so on!

like image 36
Chris Bunch Avatar answered Nov 03 '22 21:11

Chris Bunch