
Namenode not getting started

Tags: hadoop, hdfs

People also ask

How do I manually start NameNode?

You can stop the NameNode individually with the /sbin/hadoop-daemon.sh stop namenode command, then start it again with /sbin/hadoop-daemon.sh start namenode. Alternatively, run /sbin/stop-all.sh followed by /sbin/start-all.sh, which stops all the daemons and then starts them all again.

What happens if NameNode fails?

If the NameNode fails, the entire Hadoop cluster stops working. No data is lost, but all cluster jobs shut down: the NameNode is the point of contact for all DataNodes, so when it fails, all communication stops.

What are the problems with NameNode?

The NameNode is the single point of failure in Hadoop v1. If it fails, the whole cluster stops working. No data is lost, but because the NameNode is the only point of contact for the DataNodes, all cluster work stops until it is back.


I was facing the same issue of the NameNode not starting. I found a solution using the following steps:

  1. First delete all contents of the temporary folder: rm -Rf <tmp dir> (mine was /usr/local/hadoop/tmp)
  2. Format the NameNode: bin/hadoop namenode -format
  3. Start all processes again: bin/start-all.sh

You may also consider rolling back using a checkpoint (if you had checkpointing enabled).
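
If you go the delete-and-format route, it may be worth archiving the directory first, since formatting discards the existing NameNode metadata. A minimal sketch, assuming a POSIX shell with tar and find available; backup_then_clear is a hypothetical helper name, not part of Hadoop:

```shell
# Hypothetical helper: archive a directory, then delete its contents.
# Usage: backup_then_clear <dir> <archive.tar.gz>
backup_then_clear() {
  dir=$1
  archive=$2
  # Refuse to continue if the directory does not exist.
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  # Archive first; only delete when the archive was written successfully.
  tar -czf "$archive" -C "$dir" . || return 1
  # Remove the contents (including dotfiles) but keep the directory itself.
  find "$dir" -mindepth 1 -delete
}
```

For example, backup_then_clear /usr/local/hadoop/tmp "$HOME/hadoop-tmp-backup.tar.gz" could stand in for step 1 (the archive path is an arbitrary example). Keep the archive outside the directory being cleared.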


hadoop.tmp.dir in core-site.xml defaults to /tmp/hadoop-${user.name}, which is cleaned on every reboot. Change it to a directory that doesn't get cleaned on reboot.
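
A quick way to check whether a configured directory is affected is to test whether it lives under /tmp. This is only a sketch; is_under_tmp is a hypothetical helper, and the commented usage line assumes a Hadoop version where hdfs getconf is available:

```shell
# Hypothetical helper: succeed (exit 0) when the given path is /tmp
# or lies somewhere below it; fail (exit 1) otherwise.
is_under_tmp() {
  case "$1" in
    /tmp|/tmp/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage (assumes `hdfs getconf -confKey` works on your install):
#   is_under_tmp "$(hdfs getconf -confKey hadoop.tmp.dir)" \
#     && echo "hadoop.tmp.dir is under /tmp and may be wiped on reboot"
```

The check is purely textual, so it does not follow symlinks or notice other directories that your system cleans at boot.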


The following steps worked for me with Hadoop 2.2.0:

STEP 1 stop hadoop

hduser@prayagupd$ /usr/local/hadoop-2.2.0/sbin/stop-dfs.sh

STEP 2 remove tmp folder

hduser@prayagupd$ sudo rm -rf /app/hadoop/tmp/

STEP 3 create /app/hadoop/tmp/

hduser@prayagupd$ sudo mkdir -p /app/hadoop/tmp
hduser@prayagupd$ sudo chown hduser:hadoop /app/hadoop/tmp
hduser@prayagupd$ sudo chmod 750 /app/hadoop/tmp

STEP 4 format namenode

hduser@prayagupd$ hdfs namenode -format

STEP 5 start dfs

hduser@prayagupd$ /usr/local/hadoop-2.2.0/sbin/start-dfs.sh

STEP 6 check jps

hduser@prayagupd$ jps
11342 Jps
10804 DataNode
11110 SecondaryNameNode
10558 NameNode
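
If you want to script the check in STEP 6 instead of reading the jps output by eye, something like the following could work. It is a sketch assuming jps prints one "PID Name" pair per line as shown above; namenode_listed is a hypothetical helper name:

```shell
# Hypothetical helper: read `jps` output on stdin and succeed only if a
# process named exactly NameNode is listed. SecondaryNameNode does not
# count, because the second column must match the whole name.
namenode_listed() {
  awk '$2 == "NameNode" { found = 1 } END { exit !found }'
}

# Possible usage:
#   jps | namenode_listed && echo "NameNode is running"
```

Matching the whole second column rather than grepping for "NameNode" avoids a false positive when only the SecondaryNameNode is up.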

In conf/hdfs-site.xml, you should have a property like this:

<property>
    <name>dfs.name.dir</name>
    <value>/home/user/hadoop/name/data</value>
</property>

The dfs.name.dir property controls where Hadoop writes NameNode metadata. Pointing it at a directory outside /tmp makes sure the NameNode data isn't deleted when you reboot. (In Hadoop 2 and later the property is named dfs.namenode.name.dir; dfs.name.dir is the older, deprecated name.)


Open a new terminal and start the NameNode with path-to-your-hadoop-install/bin/hadoop namenode

Then check with jps; the NameNode should be running.


Why do most answers here assume that all data needs to be deleted and the NameNode reformatted before restarting Hadoop? How do we know the NameNode is stuck rather than making progress slowly? Startup can take a long time when there is a large amount of data in HDFS. Check progress in the logs before assuming anything is hung or stuck.

[kadmin@hadoop-node-0 logs]$ tail hadoop-kadmin-namenode-hadoop-node-0.log

...
2016-05-13 18:16:44,405 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 117/141 transactions completed. (83%)
2016-05-13 18:16:56,968 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 121/141 transactions completed. (86%)
2016-05-13 18:17:06,122 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 122/141 transactions completed. (87%)
2016-05-13 18:17:38,321 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 123/141 transactions completed. (87%)
2016-05-13 18:17:56,562 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 124/141 transactions completed. (88%)
2016-05-13 18:17:57,690 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 127/141 transactions completed. (90%)

This was after nearly an hour of waiting on one particular system, and it is still progressing each time I look at it. Have patience with Hadoop when bringing up the system, and check the logs before assuming something is hung or not progressing.
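
To see at a glance how far the replay has got, you could pull the latest counter out of the log. A rough sketch, assuming the log lines keep the "replaying edit log: N/M transactions completed" wording shown above; latest_replay_progress is a hypothetical helper name:

```shell
# Hypothetical helper: read NameNode log lines on stdin and print the most
# recent edit-log replay counter as "done/total". Prints nothing if the
# log contains no replay lines.
latest_replay_progress() {
  grep 'replaying edit log:' \
    | tail -n 1 \
    | sed -E 's#.*replaying edit log: ([0-9]+)/([0-9]+) transactions completed.*#\1/\2#'
}

# Possible usage:
#   latest_replay_progress < hadoop-kadmin-namenode-hadoop-node-0.log
```

Running it a few minutes apart and comparing the two counters tells you whether the NameNode is still moving or really stuck.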


In core-site.xml:

    <configuration>
       <property>
          <name>fs.defaultFS</name>
          <value>hdfs://localhost:9000</value>
       </property>
       <property>
          <name>hadoop.tmp.dir</name>
          <value>/home/yourusername/hadoop/tmp/hadoop-${user.name}</value>
       </property>
    </configuration>

and formatting the NameNode with:

hdfs namenode -format

worked for Hadoop 2.8.1.