I've come across this when describing the Zookeeper configuration for HBase, and I'm unfamiliar with the term. Does the 'N' have anything to do with the number of nodes in my HBase cluster? Or the number of nodes I should use in my Zookeeper cluster?
2f+1 refers to the level of reliability/availability you require, where f is the number of server failures you want to tolerate. In general it is not related to performance.
ZooKeeper ensembles (serving clusters) are made up of one or more servers which "vote" on each change. A majority of the servers configured in the ensemble are required to "approve" any change before it's accepted. Clients (HBase in this case) connect to the ensemble and use it to coordinate. If the ensemble is up, the clients can do this; if the ensemble is down, HBase is unable to use the service.
Say you have 3 servers (f=1) in the ensemble. If one fails, the service is still up (2 is a majority). However, if a second server fails, the service would be down.
Say you have 5 servers (f=2) in the ensemble. In this case two servers can fail (3 is a majority) and the service is still up.
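The arithmetic behind these two cases can be sketched in a few lines of Python. This is only an illustration of the majority rule described above, not ZooKeeper code; the function names `majority` and `tolerated_failures` are made up for the example.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority is still up (the f in 2f+1)."""
    return ensemble_size - majority(ensemble_size)


# Print ensemble size, majority needed, and failures tolerated.
for size in (3, 4, 5):
    print(size, majority(size), tolerated_failures(size))
```

Note that 4 servers tolerate only one failure, the same as 3, which is why ensemble sizes are usually stated in the odd form 2f+1.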
Typically 3 servers is more than sufficient. However for online production serving environments I'd suggest 5. Why? Say you take 1 server down for scheduled maintenance. If you have 5 servers you can stay up even if one of the remaining active servers fails unexpectedly.
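To make the maintenance scenario concrete, here is a small Python sketch that counts how many unexpected failures an ensemble can still absorb while some servers are deliberately offline. The helper name `spare_failures_during_maintenance` is hypothetical and only models the majority rule; it does not talk to ZooKeeper.

```python
def spare_failures_during_maintenance(ensemble_size: int,
                                      in_maintenance: int = 1) -> int:
    """Unexpected failures the ensemble can absorb while some servers
    are intentionally down, assuming a strict majority must stay up."""
    quorum = ensemble_size // 2 + 1
    return ensemble_size - in_maintenance - quorum


# With one server down for maintenance:
print(spare_failures_during_maintenance(3))  # no headroom left
print(spare_failures_during_maintenance(5))  # one more failure is survivable
```

With 3 servers, taking one down leaves exactly a majority and nothing to spare; with 5, one more server can fail and the service stays up.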
Why not have 101 servers then? TANSTAAFL (there ain't no such thing as a free lunch). See the graph here. ZK is a quorum-based service. As the number of servers increases, write performance actually drops: more servers are required to participate in the quorum process (voting), so write ops/sec decreases. (Reads are unaffected, though.)
n refers to the number of failures that the system can experience while still being able to function with at least a majority of nodes. Two examples:
- n = 1: one node can fail out of a total of 2n+1 = 3 nodes
- n = 2: two nodes can fail out of a total of 2n+1 = 5 nodes
And so on!