We use two network interfaces on each node of our Hadoop cluster: a private one (eth1) and a public one. When the Hadoop DataNode starts, it appears to pick the public IP address instead of the private one. The log file hadoop-cmf-hdfs-DATANODE-hostname.log.out shows:
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = hostname.public.net/208.x.x.x
whereas it should say:
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = hostname-eth1.private.net/192.168.x.x
There is a setting in hdfs-site.xml that controls which network interface the DataNode uses to determine the IP address it reports:
dfs.datanode.dns.interface = The name of the Network Interface from which a data node should report its IP address.
Its value is "default" unless you override it, which means the default network interface is used. To use eth1 instead, set this property in hdfs-site.xml as:
<property>
<name>dfs.datanode.dns.interface</name>
<value>eth1</value>
</property>
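To confirm that the property is actually present in the file the DataNode reads, a small check along these lines may help. This is only a sketch using Python's standard XML parser; the helper name get_hadoop_property and the SAMPLE_HDFS_SITE content are made up for illustration, and it assumes the usual layout where the properties sit inside a <configuration> element.

```python
import xml.etree.ElementTree as ET


def get_hadoop_property(xml_text, name):
    """Return the <value> of the first <property> called `name`, or None if absent.

    Hypothetical helper for illustration; it is not part of Hadoop.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Assumed sample content; replace it with the text of your own hdfs-site.xml.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.datanode.dns.interface</name>
    <value>eth1</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    # Prints the configured interface name, or None if the property is not set.
    print(get_hadoop_property(SAMPLE_HDFS_SITE, "dfs.datanode.dns.interface"))
```

If this prints None for your real file, the property is not set there and the DataNode falls back to "default". The DataNode reads its configuration at startup, so restart it after changing the value and then check the STARTUP_MSG host line in the log again.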
To quote from the book "Hadoop: The Definitive Guide":
There is also a setting for controlling which network interfaces the datanodes use as their IP addresses (for HTTP and RPC servers). The relevant property is
dfs.datanode.dns.interface, which is set to default to use the default network
interface. You can set this explicitly to report the address of a particular interface (eth0, for example).