
Hadoop yarn node -list shows slaves as localhost.localdomain:#somenumber, connection refused exception

Tags:

hadoop

I get a connection refused exception from localhost.localdomain/127.0.0.1 to localhost.localdomain:55352 when trying to run the wordcount program. yarn node -list gives:

hduser@localhost:/usr/local/hadoop/etc/hadoop$ yarn node -list
15/05/27 07:23:54 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.111.72:8040
Total Nodes:2
         Node-Id         Node-State Node-Http-Address   Number-of-Running-Containers
localhost.localdomain:32991         RUNNING localhost.localdomain:8042                             0
localhost.localdomain:55352         RUNNING localhost.localdomain:8042                             0

master /etc/hosts:

127.0.0.1    localhost localhost.localdomain localhost4 localhost4.localdomain4
#127.0.1.1    ubuntu-Standard-PC-i440FX-PIIX-1996
192.168.111.72  master
192.168.111.65  slave1
192.168.111.66  slave2

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

slave /etc/hosts:

127.0.0.1       localhost.localdomain localhost
#127.0.1.1      ubuntu-Standard-PC-i440FX-PIIX-1996
192.168.111.72  master
#192.168.111.65  slave1
#192.168.111.66  slave2

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

My understanding is that the master is wrongly trying to connect to the slaves on localhost. Please help me resolve this. Any suggestions are appreciated. Thank you.

rajesh asked Feb 10 '23 00:02


1 Answer

Here is how NodeManager builds the NodeId:

private NodeId buildNodeId(InetSocketAddress connectAddress,
  String hostOverride) {
  if (hostOverride != null) {
    connectAddress = NetUtils.getConnectAddress(
      new InetSocketAddress(hostOverride, connectAddress.getPort()));
  }
  return NodeId.newInstance(
    connectAddress.getAddress().getCanonicalHostName(),
    connectAddress.getPort());
}

NodeManager takes the canonical hostname of the address it binds to. If that address is 127.0.0.1, the lookup returns whatever hostname is mapped to 127.0.0.1, which is normally localhost.
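
To see which name a given host would register under, the same lookup can be reproduced outside Hadoop. This is a minimal sketch, not NodeManager code: the class CanonicalNameCheck and the method nodeIdFor are hypothetical names, and the printed hostname depends on the /etc/hosts and DNS setup of the machine it runs on.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical helper class, not part of Hadoop.
class CanonicalNameCheck {

    // Builds a "host:port" string the same way buildNodeId derives its parts:
    // the canonical hostname of the resolved address plus the port.
    public static String nodeIdFor(String host, int port) {
        InetSocketAddress addr = new InetSocketAddress(host, port);
        InetAddress ip = addr.getAddress(); // null when the host cannot be resolved
        if (ip == null) {
            throw new IllegalArgumentException("cannot resolve " + host);
        }
        return ip.getCanonicalHostName() + ":" + port;
    }

    public static void main(String[] args) {
        // 55352 is the port from the question; any port works here.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println(nodeIdFor(host, 55352));
    }
}
```

Running it on a slave with 127.0.0.1 and then with the slave's LAN address shows whether the two lookups return different names.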

In your case, /etc/hosts on the slaves lists localhost.localdomain as the first hostname for 127.0.0.1, so both slaves register under that name. A possible solution is to change the first line of /etc/hosts on slave1 and slave2 respectively to:

  127.0.0.1  slave1 localhost.localdomain localhost

and

  127.0.0.1  slave2 localhost.localdomain localhost
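
The effect of reordering the names can be checked with a small sketch. By the hosts file convention, the first name after the address is the canonical hostname and the rest are aliases. The class HostsCanonicalName and the method canonicalNameFor are hypothetical, and the sketch assumes the resolver answers from /etc/hosts; it only parses text and does not perform a real lookup.

```java
import java.util.Optional;

// Hypothetical helper class: parses hosts-file text, performs no name resolution.
class HostsCanonicalName {

    // Returns the first hostname listed for the address, if any line maps it.
    public static Optional<String> canonicalNameFor(String hostsText, String address) {
        for (String line : hostsText.split("\n")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // drop comments
            }
            String[] fields = line.trim().split("\\s+");
            if (fields.length >= 2 && fields[0].equals(address)) {
                return Optional.of(fields[1]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String current = String.join("\n",
            "127.0.0.1       localhost.localdomain localhost",
            "#127.0.1.1      ubuntu-Standard-PC-i440FX-PIIX-1996",
            "192.168.111.72  master");
        String proposed = String.join("\n",
            "127.0.0.1  slave1 localhost.localdomain localhost",
            "192.168.111.72  master");

        // prints "localhost.localdomain"
        System.out.println(canonicalNameFor(current, "127.0.0.1").orElse("(none)"));
        // prints "slave1"
        System.out.println(canonicalNameFor(proposed, "127.0.0.1").orElse("(none)"));
    }
}
```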

Pin Zhang answered Feb 13 '23 07:02