I can't figure out why my DataNode slave VMs won't connect to my master VM; I'm seeing several errors. Any suggestion is welcome. To start, here is one of them, from my slave VM's log:
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8:9000
Because of this, I can't run the job I want on my master VM:
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
which gives me this error:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/ubuntu/QuasiMonteCarlo_1386793331690_1605707775/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
Even so, hdfs dfsadmin -report (on the master VM) reports all zeros:
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Datanodes available: 0 (0 total, 0 dead)
For this setup, I built three Ubuntu VMs on OpenStack: one master and two slaves. On the master, /etc/hosts contains:
127.0.0.1 localhost
50.50.1.9 ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8
50.50.1.8 slave1
50.50.1.4 slave2
core-site.xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/ubuntu/hadoop-2.2.0/tmp</value>
</property>
hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/ubuntu/hadoop-2.2.0/etc/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/ubuntu/hadoop-2.2.0/etc/hdfs/datanode</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
And my slaves file contains one hostname per line, slave1 and slave2, as shown below.
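For completeness, here is the slaves file (the path assumes the Hadoop 2.2.0 install under /home/ubuntu, matching the configs above):
$ cat /home/ubuntu/hadoop-2.2.0/etc/hadoop/slaves
slave1
slave2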
None of the logs on the master VM contain errors, but on the slave VMs I get the connection error above, and the NodeManager log shows an error as well:
Error starting NodeManager org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ubuntu-e6df65dc-bf95-45ca-bad5-f8ddcc272b76/50.50.1.8 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused;
From my slave machine: core-site.xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/ubuntu/hadoop-2.2.0/tmp</value>
</property>
hdfs-site.xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/ubuntu/hadoop-2.2.0/etc/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/ubuntu/hadoop-2.2.0/etc/hdfs/datanode</value>
</property>
and my /etc/hosts:
127.0.0.1 localhost
50.50.1.8 ubuntu-e6df65dc-bf95-45ca-bad5-f8ddcc272b76
50.50.1.9 ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8
jps output on the master:
15863 ResourceManager
15205 SecondaryNameNode
14967 NameNode
16194 Jps
and on a slave:
1988 Jps
1365 DataNode
1894 NodeManager
Of all the errors shown, the one below turned out to be the main reason the master and slaves could not connect:
Error starting NodeManager org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ubuntu-e6df65dc-bf95-45ca-bad5-f8ddcc272b76/50.50.1.8 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused;
Basically, 0.0.0.0:8031 is the default address of yarn.resourcemanager.resource-tracker.address. I checked with lsof -i :8031 and found the port wasn't open/allowed. Since I'm on OpenStack (cloud), I added 8031 and the other ports that appeared in the errors to my security group, and voilà, it worked as intended.
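For reference, the 0.0.0.0 comes from the default yarn.resourcemanager.hostname: in Hadoop 2.2.0, yarn.resourcemanager.resource-tracker.address defaults to ${yarn.resourcemanager.hostname}:8031. In addition to opening the ports, many setups also pin the ResourceManager host explicitly in yarn-site.xml on every node; a minimal sketch, reusing the master hostname from the question:
<!-- yarn-site.xml (all nodes): point NodeManagers at the real
     ResourceManager so they stop falling back to 0.0.0.0:8031 -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8</value>
</property>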
I struggled a lot and finally got it working after running "systemctl stop firewalld"; before that I had also disabled SELinux and IPv6.
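A sketch of the commands involved, under some assumptions: the security group name "default" and the 0.0.0.0/0 CIDR are examples, and the firewalld/SELinux steps only apply on distros that run them:
# OpenStack: allow a Hadoop/YARN port in the security group (example for 8031;
# repeat for the other ports that show up in the errors, e.g. 9000)
nova secgroup-add-rule default tcp 8031 8031 0.0.0.0/0

# Stop the host firewall (on hosts running firewalld)
systemctl stop firewalld
systemctl disable firewalld

# Put SELinux in permissive mode until reboot
# (to make it permanent, set SELINUX=disabled in /etc/selinux/config)
setenforce 0

# IPv6 can be disabled via sysctl, e.g. net.ipv6.conf.all.disable_ipv6=1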