
Datanode process not running in Hadoop

I set up and configured a multi-node Hadoop cluster using this tutorial.

When I type in the start-all.sh command, it shows all the processes initializing properly as follows:

starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-namenode-jawwadtest1.out
jawwadtest1: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-datanode-jawwadtest1.out
jawwadtest2: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-datanode-jawwadtest2.out
jawwadtest1: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-secondarynamenode-jawwadtest1.out
starting jobtracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-jobtracker-jawwadtest1.out
jawwadtest1: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-tasktracker-jawwadtest1.out
jawwadtest2: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-tasktracker-jawwadtest2.out

However, when I type the jps command, I get the following output:

31057 NameNode
4001 RunJar
6182 RunJar
31328 SecondaryNameNode
31411 JobTracker
32119 Jps
31560 TaskTracker

As you can see, there's no datanode process running. I tried configuring a single-node cluster but got the same problem. Would anyone have any idea what could be going wrong here? Are there any configuration files not mentioned in the tutorial that I may have overlooked? I am new to Hadoop and somewhat lost, so any help would be greatly appreciated.

EDIT: hadoop-root-datanode-jawwadtest1.log:

STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.3
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/$
************************************************************/
2012-08-09 23:07:30,717 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loa$
2012-08-09 23:07:30,734 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapt$
2012-08-09 23:07:30,735 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:30,736 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:31,018 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapt$
2012-08-09 23:07:31,024 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl:$
2012-08-09 23:07:32,366 INFO org.apache.hadoop.ipc.Client: Retrying connect to $
2012-08-09 23:07:37,949 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: $
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(Data$
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransition$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNo$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNod$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode($
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataN$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.$
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1$

2012-08-09 23:07:37,951 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: S$
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at jawwadtest1/198.101.220.90
************************************************************/
Asked Aug 09 '12 by Jawwad Zakaria


People also ask

How can I tell if DataNode is running in Hadoop?

To check whether the Hadoop daemons are running, just run the jps command in the shell (make sure a JDK is installed on your system). It lists all running Java processes, including whichever Hadoop daemons are up.
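For example, a healthy Hadoop 1.x node running every daemon might show something like this (the process IDs are illustrative and will differ on your machine):

  $ jps
  31057 NameNode
  31328 SecondaryNameNode
  31411 JobTracker
  31560 TaskTracker
  31712 DataNode

You can also ask the NameNode directly which DataNodes are live with bin/hadoop dfsadmin -report (or hdfs dfsadmin -report on 2.x).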

What happens when a DataNode fails in Hadoop?

As soon as a DataNode is declared dead, the data blocks on the failed DataNode are re-replicated to other DataNodes according to the replication factor specified in the hdfs-site.xml file. Once the failed DataNode comes back, the NameNode manages the replication factor again.
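For reference, the replication factor is the dfs.replication property in hdfs-site.xml; a minimal sketch (3 is the common default, not a value taken from this question's setup):

  <configuration>
    <property>
      <name>dfs.replication</name>
      <!-- each HDFS block is kept on this many DataNodes -->
      <value>3</value>
    </property>
  </configuration>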

How do I manually start my DataNode?

The DataNode daemon should be started manually using the $HADOOP_HOME/bin/hadoop-daemon.sh script. It will contact the master (NameNode) automatically and join the cluster. The new node should also be added to the conf/slaves file on the master server, since the script-based commands identify nodes from that file.
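Concretely, on the node itself (Hadoop 1.x layout, as used in this question):

  $ $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
  $ $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker

Then add the node's hostname to conf/slaves on the master so that start-all.sh/start-dfs.sh will include it on the next cluster start.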

What happens when DataNode is down?

If the NameNode goes down, the whole Hadoop cluster becomes inaccessible and is considered dead. DataNodes store the actual data and work as instructed by the NameNode. A Hadoop file system can have multiple DataNodes but only one active NameNode.


2 Answers

You need to do something like this:

  • bin/stop-all.sh (or stop-dfs.sh and stop-yarn.sh in the 2.x series)
  • rm -Rf /app/tmp/hadoop-your-username/*
  • bin/hadoop namenode -format (or bin/hdfs namenode -format in the 2.x series)

The solution was taken from: http://pages.cs.brandeis.edu/~cs147a/lab/hadoop-troubleshooting/. It basically consists of restarting from scratch, so make sure you won't lose data by formatting the HDFS.
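Put together, a session might look like this (the /app/tmp/hadoop-your-username path comes from the steps above and depends on your hadoop.tmp.dir setting, so adjust it to match your own config):

  $ bin/stop-all.sh                           # stop all daemons (1.x)
  $ rm -Rf /app/tmp/hadoop-your-username/*    # WARNING: destroys all HDFS data
  $ bin/hadoop namenode -format               # re-create an empty namespace
  $ bin/start-all.sh                          # restart; jps should now list DataNode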

Answered Oct 05 '22 by giltsl


I ran into the same issue. I had created an hdfs folder '/home/username/hdfs' with subdirectories name, data, and tmp, which were referenced in the config XML files under hadoop/conf.

When I started Hadoop and ran jps, I couldn't find the datanode, so I tried to start it manually with bin/hadoop datanode. The error message made me realize it had a permissions issue accessing dfs.data.dir=/home/username/hdfs/data/, which was referenced in one of the Hadoop config files. All I had to do was stop Hadoop, delete the contents of the /home/username/hdfs/tmp/* directory, run chmod -R 755 /home/username/hdfs/, and then start Hadoop again. I could find the datanode!
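In command form, the fix described above (paths assume the /home/username/hdfs layout from this answer):

  $ bin/stop-all.sh
  $ rm -rf /home/username/hdfs/tmp/*      # clear stale temp data
  $ chmod -R 755 /home/username/hdfs/     # make dfs.data.dir accessible to the hadoop user
  $ bin/start-all.sh
  $ jps                                   # DataNode should now appear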

Answered Oct 05 '22 by sunskin