
What does the command "hadoop namenode -format" do?

I am learning Hadoop by following a tutorial and am trying to set it up in pseudo-distributed mode on my machine.

My core-site.xml is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
   <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
      <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.</description>
   </property>
</configuration>

My hdfs-site.xml file is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
   <property>
      <name>dfs.replication</name>
      <value>1</value>
      <description>The actual number of replications can be specified when the file is created.</description>
   </property>
</configuration>

My mapred-site.xml file is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
   <property>
      <name>mapred.job.tracker</name>
      <value>localhost:9001</value>
      <description>The host and port that the MapReduce job tracker runs at.</description>
   </property>
</configuration>

I ran the command below and it completed successfully, but what does it actually do?

hadoop-1.2.1$ bin/hadoop namenode -format
14/11/26 12:37:16 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = myhost/127.0.0.8
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.6.0_45
************************************************************/
14/11/26 12:37:17 INFO util.GSet: Computing capacity for map BlocksMap
14/11/26 12:37:17 INFO util.GSet: VM type       = 64-bit
14/11/26 12:37:17 INFO util.GSet: 2.0% max memory = 932118528
14/11/26 12:37:17 INFO util.GSet: capacity      = 2^21 = 2097152 entries
14/11/26 12:37:17 INFO util.GSet: recommended=2097152, actual=2097152
14/11/26 12:37:17 INFO namenode.FSNamesystem: fsOwner=myuser
14/11/26 12:37:17 INFO namenode.FSNamesystem: supergroup=supergroup
14/11/26 12:37:17 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/11/26 12:37:17 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/11/26 12:37:17 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/11/26 12:37:17 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
14/11/26 12:37:17 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/11/26 12:37:17 INFO common.Storage: Image file /tmp/hadoop-myuser/dfs/name/current/fsimage of size 115 bytes saved in 0 seconds.
14/11/26 12:37:18 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-myuser/dfs/name/current/edits
14/11/26 12:37:18 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-myuser/dfs/name/current/edits
14/11/26 12:37:18 INFO common.Storage: Storage directory /tmp/hadoop-myuser/dfs/name has been successfully formatted.
14/11/26 12:37:18 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at chaitanya-OptiPlex-3010/127.0.0.8
************************************************************/

Can someone please explain what it is doing internally?

I have gone through these posts, but neither gives a clear explanation:

What exactly is hadoop namenode formatting?

hadoop namenode is not formatting

How can I check this in practice on my machine, so that I can see the differences before and after running the command? I am new to Hadoop, so this may be a trivial question.

asked Nov 26 '14 at 07:11 by learner



1 Answer

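The command hadoop namenode -format initializes a new, empty HDFS namespace. Any files previously stored in HDFS become inaccessible, so in effect it deletes everything in your HDFS.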

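The Hadoop tmp directory on the local filesystem holds the NameNode and DataNode storage folders (name and data under /tmp/hadoop-myuser/dfs with the default settings). Formatting rewrites the NameNode folder with a fresh, empty fsimage and edits log, as the log output in the question shows. The command itself does not touch the DataNode folder, but the blocks left there no longer belong to any file in the new namespace.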

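Note: to format the NameNode cleanly, first stop all Hadoop services, then delete the tmp folder (which contains both the NameNode and DataNode folders) from your local filesystem, run the format command, and start the Hadoop services again. Clearing the DataNode folder as well avoids a mismatch between old DataNode data and the newly formatted NameNode.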

Why hadoop namenode -format is needed:

The NameNode is the centerpiece of an HDFS file system: it keeps the directory tree of all files and tracks where across the cluster each file's data is kept. In short, it holds the metadata, while the DataNodes hold the actual blocks. Formatting the NameNode resets this metadata. All information about the data stored on the DataNodes is lost, so the cluster starts out empty and can be reused for new data.
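One way to see that the metadata really was reset is to look at the namespaceID. In Hadoop 1.x each storage directory keeps a small key=value file at current/VERSION, and formatting the NameNode generates a new namespaceID there. The sketch below is only an illustration, written in Python rather than with any Hadoop tool: the helper names read_version and same_namespace are made up for this example, and the directories you pass in are assumed to be your NameNode and DataNode storage directories.

```python
from pathlib import Path


def read_version(storage_dir):
    """Parse <storage_dir>/current/VERSION (key=value lines) into a dict."""
    props = {}
    version_file = Path(storage_dir) / "current" / "VERSION"
    for line in version_file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the leading timestamp comment
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def same_namespace(name_dir, data_dir):
    """True only if both directories record the same, non-missing namespaceID."""
    name_id = read_version(name_dir).get("namespaceID")
    data_id = read_version(data_dir).get("namespaceID")
    return name_id is not None and name_id == data_id
```

With the default locations from the question you would call same_namespace("/tmp/hadoop-myuser/dfs/name", "/tmp/hadoop-myuser/dfs/data"), assuming a DataNode has already run at least once so that its VERSION file exists. If the IDs differ after a format, the DataNode is still holding data from the old namespace.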

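With the default settings the NameNode directory is ${hadoop.tmp.dir}/dfs/name, which resolves to "/tmp/hadoop-myuser/dfs/name" in your case.

When you formatted the NameNode, this location was reinitialized. That is what the "Storage directory /tmp/hadoop-myuser/dfs/name has been successfully formatted" line in your log reports.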
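To see the before/after difference on your own machine, you can record the contents of the NameNode directory, run the format, and compare. The following is a minimal sketch under stated assumptions: snapshot and diff_snapshots are hypothetical helper names, not part of Hadoop, and the script only reads files in the directory you give it.

```python
import hashlib
from pathlib import Path


def snapshot(directory):
    """Map each file path (relative to directory) to (size, SHA-256 hex digest)."""
    root = Path(directory)
    result = {}
    if not root.exists():
        return result  # directory not created yet: empty snapshot
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            digest = hashlib.sha256(data).hexdigest()
            result[str(path.relative_to(root))] = (len(data), digest)
    return result


def diff_snapshots(before, after):
    """Return (added, removed, changed) as sorted lists of relative paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return added, removed, changed
```

A typical session would take before = snapshot("/tmp/hadoop-myuser/dfs/name"), run bin/hadoop namenode -format in another terminal, take a second snapshot, and print the result of diff_snapshots(before, after). Because a snapshot is a plain dict, you can also save it with json.dump if you want to compare across sessions.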

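To change the NameNode and DataNode locations, add the following properties to hdfs-site.xml. These are the Hadoop 2.x property names; on Hadoop 1.x, such as the 1.2.1 release in the question, the equivalent properties are dfs.name.dir and dfs.data.dir.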

<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/search/data/dfs/namenode</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/search/data/dfs/datanode</value>
</property>
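If you are not sure which directories your installation is actually configured to use, a quick check is to read the properties back out of the config file. This is a rough sketch using only Python's standard library; read_properties is a hypothetical helper, and it looks only at the XML text you pass in, so it does not apply Hadoop's built-in defaults or expand variables such as ${hadoop.tmp.dir}.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props
```

For example, read_properties(open("conf/hdfs-site.xml").read()) returns a dict you can query for dfs.namenode.name.dir (or dfs.name.dir on Hadoop 1.x). The conf/hdfs-site.xml path is an assumption about where your install keeps its configuration. A missing key means the property is not set in that file and the default location applies.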

I hope this helps. :-)

answered Dec 07 '22 at 19:12 by Suresh Ram