I'm struggling to setup a Hbase distributed cluster with 2 nodes, one is my machine and one is the VM, using the "host-only" Adapter in VirtualBox.
My problem is that the region server (from VM machine) can't connect to Hbase master running on host machine. Although in Hbase shell I can list, create table, ..., in regionserver on VM machine ('slave'), the log always show
org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to master. Retrying. Error was:
java.net.ConnectException: Connection refused
Previously, I've successfully setup Hadoop, HDFS and MapReduce on this cluster with 2 nodes named as 'master', and 'slave', 'master' as master node and both 'master' and 'slave' work as slave nodes, these names bound to the vboxnet0 interface of VirtualBox (the hostnames in /etc/hostname is different). I'also specify the "slave.host.name" property for each node as 'master' and 'slave'.
It seems that the Hbase master on the 'master' always run with 'localhost' hostname, from the slave machine, I can't telnet to the hbase master with 'master' hostname. So is there any way to specify hostname use for Hbase master as 'master', I've tried specify some properties about DNS interface for ZooKeeper, Master, RegionServer to use internal interface between master and slave, but it still does not work at all.
/etc/hosts for both as something like
127.0.0.1 localhost
127.0.0.1 ubuntu.mymachine
# For Hadoop
192.168.56.1 master
192.168.56.101 slave
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Start the cluster by running the start-hbase.sh command on node-1. ZooKeeper starts first, followed by the Master, then the RegionServers, and finally the backup Masters. Run the jps command on each node to verify that the correct processes are running on each server.
HMaster – The implementation of Master Server in HBase is HMaster. It is a process in which regions are assigned to region server as well as DDL (create, delete table) operations. It monitor all Region Server instances present in the cluster. In a distributed environment, Master runs several background threads.
In HBase the slaves are called Region Servers. Each Region Server is responsible to serve a set of regions, and one Region (i.e. range of rows) can be served only by one Region Server.
The answer that @Infinity provided seems to belong to version ~0.9.4.
For version 1.1.4.
according to the source code from
org.apache.hadoop.hbase.master.HMaster
the configuration should be:
<property>
<name>hbase.master.hostname</name>
<value>master.local</value>
<!-- master.local is the DNS name in my network pointing to hbase master -->
</property>
After setting this value, region servers are able to connect to hbase master; however, in my environment, the region server complained about:
com.google.protobuf.ServiceException: java.net.SocketException: Invalid argument
The problem disappeared after I installed oracle JDK 8 instead of open-jdk-7 in all of my nodes.
So in conclusion, here is my solution:
use dns name server instead of setting /etc/hosts, as hbase is very picky on hostname and seems requires DNS lookup as well as reverse DNS lookup.
upgrade jdk to oracle 8
use the setting item mentioned above.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With