I have a 4 node cluster (1 Namenode/Resource Manager 3 datanodes/node managers)
I am trying to run a simple tez example orderedWordCount
hadoop jar C:\HDP\tez-0.4.0.2.1.1.0-1621\tez-mapreduce-examples-0.4.0.2.1.1.0-1621.jar orderedwordcount sample/test.txt /sample/out
The job gets accepted ,the Application master and container gets setup but on the nodemanager I see these logs
2014-09-10 17:53:31,982 INFO [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2014-09-10 17:53:34,060 INFO [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
After configurable timeout the job fails
I searched for this problem and it always pointed to yarn.resourcemanager.scheduler.address configuration. In all my resource manager node and node managers I have this configuration defined correctly but for some reason its not getting picked up
<property>
<name>yarn.resourcemanager.hostname</name>
<value>10.234.225.69</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>${yarn.resourcemanager.hostname}:8033</value>
</property>
It might be possible that your ResourceManager is listening on an IPv6 Port while your worker nodes (i.e NodeManagers) might be using IPv4 to connect to the ResourceManager
To quickly check if this is the case, do a
netstat -aln | grep 8030
If you get something similar to :::8030
, then your ResourceManager is indeed listening on an IPv6 Port. If its a IPv4 port, you should see something similar to 0.0.0.0:8030
To fix this, you might want to consider disabling IPv6 on all your machines and try once again.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With