
Datanode process not running in Hadoop

I set up and configured a multi-node Hadoop cluster using this tutorial.

When I type in the start-all.sh command, it shows all the processes initializing properly as follows:

starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-namenode-jawwadtest1.out
jawwadtest1: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-datanode-jawwadtest1.out
jawwadtest2: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-datanode-jawwadtest2.out
jawwadtest1: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-secondarynamenode-jawwadtest1.out
starting jobtracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-jobtracker-jawwadtest1.out
jawwadtest1: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-tasktracker-jawwadtest1.out
jawwadtest2: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-tasktracker-jawwadtest2.out

However, when I type the jps command, I get the following output:

31057 NameNode
4001 RunJar
6182 RunJar
31328 SecondaryNameNode
31411 JobTracker
32119 Jps
31560 TaskTracker

As you can see, there's no datanode process running. I tried configuring a single-node cluster but got the same problem. Would anyone have any idea what could be going wrong here? Are there any configuration files not mentioned in the tutorial that I may have overlooked? I am new to Hadoop and somewhat lost, so any help would be greatly appreciated.

EDIT: hadoop-root-datanode-jawwadtest1.log:

STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.3
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/$
************************************************************/
2012-08-09 23:07:30,717 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loa$
2012-08-09 23:07:30,734 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapt$
2012-08-09 23:07:30,735 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:30,736 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:31,018 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapt$
2012-08-09 23:07:31,024 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:32,366 INFO org.apache.hadoop.ipc.Client: Retrying connect to $
2012-08-09 23:07:37,949 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: $
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(Data$
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransition$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNo$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNod$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode($
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataN$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1$

2012-08-09 23:07:37,951 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: S$
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at jawwadtest1/198.101.220.90
************************************************************/
Asked Aug 09 '12 by Jawwad Zakaria


People also ask

How can I tell if DataNode is running in Hadoop?

To check whether the Hadoop daemons are running, just run the jps command in the shell (make sure a JDK is installed on your system). It lists all running Java processes, including whichever Hadoop daemons are up.
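For example, a healthy Hadoop 1.x node running every daemon might show something like this (the process IDs are illustrative and will differ on your machine):

  $ jps
  31057 NameNode
  31328 SecondaryNameNode
  31411 JobTracker
  31560 TaskTracker
  31712 DataNode

You can also ask the NameNode directly which DataNodes are live with bin/hadoop dfsadmin -report (or hdfs dfsadmin -report on 2.x).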

What happens when a DataNode fails in Hadoop?

As soon as a DataNode is declared dead, the data blocks on the failed DataNode are re-replicated to other DataNodes according to the replication factor specified in the hdfs-site.xml file. Once the failed DataNode comes back, the NameNode manages the replication factor again.
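For reference, the replication factor is the dfs.replication property in hdfs-site.xml; a minimal sketch (3 is the common default, not a value taken from this question's setup):

  <configuration>
    <property>
      <name>dfs.replication</name>
      <!-- each HDFS block is kept on this many DataNodes -->
      <value>3</value>
    </property>
  </configuration>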

How do I manually start my DataNode?

The DataNode daemon should be started manually using the $HADOOP_HOME/bin/hadoop-daemon.sh script. It will contact the master (NameNode) automatically and join the cluster. The new node should also be added to the conf/slaves file on the master server, since the script-based commands identify nodes from that file.
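Concretely, on the node itself (Hadoop 1.x layout, as used in this question):

  $ $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
  $ $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker

Then add the node's hostname to conf/slaves on the master so that start-all.sh/start-dfs.sh will include it on the next cluster start.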

What happens when DataNode is down?

If the NameNode goes down, the whole Hadoop cluster becomes inaccessible and is considered dead. DataNodes store the actual data and work as instructed by the NameNode. A Hadoop file system can have multiple DataNodes but only one active NameNode.


2 Answers

You need to do something like this:

  • bin/stop-all.sh (or stop-dfs.sh and stop-yarn.sh in the 2.x series)
  • rm -Rf /app/tmp/hadoop-your-username/*
  • bin/hadoop namenode -format (or bin/hdfs namenode -format in the 2.x series)

The solution was taken from: http://pages.cs.brandeis.edu/~cs147a/lab/hadoop-troubleshooting/. It basically consists of restarting from scratch, so make sure you won't lose data by formatting the HDFS.
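Put together, a session might look like this (the /app/tmp/hadoop-your-username path comes from the steps above and depends on your hadoop.tmp.dir setting, so adjust it to match your own config):

  $ bin/stop-all.sh                           # stop all daemons (1.x)
  $ rm -Rf /app/tmp/hadoop-your-username/*    # WARNING: destroys all HDFS data
  $ bin/hadoop namenode -format               # re-create an empty namespace
  $ bin/start-all.sh                          # restart; jps should now list DataNode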

Answered Oct 05 '22 by giltsl


I ran into the same issue. I had created an hdfs folder '/home/username/hdfs' with subdirectories name, data, and tmp, which were referenced in the config XML files under hadoop/conf.

When I started Hadoop and ran jps, I couldn't find the datanode, so I tried to start it manually with bin/hadoop datanode. The error message made me realize it had a permissions issue accessing dfs.data.dir=/home/username/hdfs/data/, which was referenced in one of the Hadoop config files. All I had to do was stop Hadoop, delete the contents of the /home/username/hdfs/tmp/* directory, run chmod -R 755 /home/username/hdfs/, and then start Hadoop again. I could find the datanode!
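In command form, the fix described above (paths assume the /home/username/hdfs layout from this answer):

  $ bin/stop-all.sh
  $ rm -rf /home/username/hdfs/tmp/*      # clear stale temp data
  $ chmod -R 755 /home/username/hdfs/     # make dfs.data.dir accessible to the hadoop user
  $ bin/start-all.sh
  $ jps                                   # DataNode should now appear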

Answered Oct 05 '22 by sunskin