
What is the minimum server composition for HBase?

I want a fully distributed setup that uses sharding but does not use Hadoop. It's for a production environment.

I'm hoping for an explanation like this:

  • Server 1: Zookeeper

  • Server 2: Region server

    ... and more

Thank you.

Takamario asked Feb 22 '12 16:02

1 Answer

The minimum is one; see pseudo-distributed mode. The moving parts involved are:

Assuming that you are running on HDFS (which you should be doing):

  1. 1 HDFS NameNode
  2. 1 or more HDFS Secondary NameNode(s)
  3. 1 or more HDFS DataNode(s)

For MapReduce (if you want it):

  1. 1 MapReduce JobTracker
  2. 1 or more MapReduce TaskTracker(s) (Usually same machines as datanodes)

For HBase itself:

  1. 1 or more HBase Master(s) (Hot backups are a good idea)
  2. 1 or more HBase RegionServer(s) (Usually same machines as datanodes)
  3. 1 or more Thrift Servers (if you need to access HBase from outside the network it is on)

For ZooKeeper:

  1. 3 - 5 ZooKeeper node(s)
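
The reason for 3 to 5 ZooKeeper nodes is that an ensemble stays available only while a majority of its members are up. A minimal Python sketch of that arithmetic follows; the function names are hypothetical and only illustrate the majority rule, they are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} node(s): quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this rule, 3 nodes survive one failure and 5 nodes survive two, while an even-sized ensemble tolerates no more failures than the next smaller odd size, which is why odd sizes are the usual choice.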

The number of machines that you need is really dependent on how much reliability you need in the face of hardware failure and for what kind of nodes. The only node of the above that does not (yet) support hot failover or other recovery in the face of hardware failure is the HDFS NameNode, though that is being fixed in the more recent Hadoop releases.

You typically want to set the HDFS replication factor for the data your RegionServers store to 3, so that you can take advantage of rack awareness.

So after that long diatribe, I'd suggest at a minimum (for a production deployment):

  • 1x HDFS NameNode
  • 1x JobTracker / Secondary NameNode
  • 3x ZK Nodes
  • 3x DataNode / RegionServer nodes (And if you want to run MapReduce, TaskTracker)
  • 1x Thrift Server (Only if accessing HBase from outside of the network it is running on)
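
To make the machine count of the list above explicit, here is a small Python tally. It assumes each bullet is its own group of machines and that roles joined by a slash share a machine; the `LAYOUT` structure and `machine_count` helper are hypothetical names used only for illustration.

```python
# Each entry: (roles co-located on one machine, number of machines, optional?)
LAYOUT = [
    ("HDFS NameNode", 1, False),
    ("JobTracker / Secondary NameNode", 1, False),
    ("ZooKeeper", 3, False),
    ("DataNode / RegionServer (+ TaskTracker)", 3, False),
    ("Thrift Server", 1, True),  # only for access from outside the network
]


def machine_count(layout, include_optional=False):
    """Sum the machines, skipping optional roles unless requested."""
    return sum(count for _, count, optional in layout
               if include_optional or not optional)


if __name__ == "__main__":
    print("Without Thrift:", machine_count(LAYOUT))
    print("With Thrift:   ", machine_count(LAYOUT, include_optional=True))
```

With these assumptions the suggestion comes to 8 machines, or 9 if the Thrift server is needed.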

Chris Shain answered Oct 13 '22 18:10
