What does the command "hadoop namenode -format" do?

I am learning Hadoop by following a tutorial and am trying to set up pseudo-distributed mode on my machine.

My core-site.xml is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
   <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
      <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.</description>
   </property>
</configuration>

My hdfs-site.xml file is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
   <property>
      <name>dfs.replication</name>
      <value>1</value>
      <description>The actual number of replications can be specified when the file is created.</description>
   </property>
</configuration>

My mapred-site.xml file is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
   <property>
      <name>mapred.job.tracker</name>
      <value>localhost:9001</value>
      <description>The host and port that the MapReduce job tracker runs at.</description>
   </property>
</configuration>

The command below ran successfully, but what is it actually doing?

hadoop-1.2.1$ bin/hadoop namenode -format
14/11/26 12:37:16 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = myhost/127.0.0.8
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.6.0_45
************************************************************/
14/11/26 12:37:17 INFO util.GSet: Computing capacity for map BlocksMap
14/11/26 12:37:17 INFO util.GSet: VM type       = 64-bit
14/11/26 12:37:17 INFO util.GSet: 2.0% max memory = 932118528
14/11/26 12:37:17 INFO util.GSet: capacity      = 2^21 = 2097152 entries
14/11/26 12:37:17 INFO util.GSet: recommended=2097152, actual=2097152
14/11/26 12:37:17 INFO namenode.FSNamesystem: fsOwner=myuser
14/11/26 12:37:17 INFO namenode.FSNamesystem: supergroup=supergroup
14/11/26 12:37:17 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/11/26 12:37:17 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/11/26 12:37:17 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/11/26 12:37:17 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
14/11/26 12:37:17 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/11/26 12:37:17 INFO common.Storage: Image file /tmp/hadoop-myuser/dfs/name/current/fsimage of size 115 bytes saved in 0 seconds.
14/11/26 12:37:18 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-myuser/dfs/name/current/edits
14/11/26 12:37:18 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-myuser/dfs/name/current/edits
14/11/26 12:37:18 INFO common.Storage: Storage directory /tmp/hadoop-myuser/dfs/name has been successfully formatted.
14/11/26 12:37:18 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at chaitanya-OptiPlex-3010/127.0.0.8
************************************************************/

Can someone please explain what it does internally?

I have gone through these posts, but neither gives a correct explanation:

What exactly is hadoop namenode formatting?

hadoop namenode is not formatting

How can I check this practically on my machine, so that I can see the differences before and after running the command? I am new to Hadoop, so this may be a trivial question.

asked Nov 26 '14 07:11

learner

1 Answer

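The command hadoop namenode -format initializes a new, empty HDFS namespace. It recreates the NameNode's metadata (the fsimage and edits files), so every file that previously existed in HDFS becomes unreachable. In effect, all files in your HDFS are lost.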

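By default, the Hadoop tmp directory on the local filesystem contains two folders: dfs/name for the NameNode metadata and dfs/data for the DataNode blocks. Formatting reinitializes only the NameNode folder. The DataNode folder is not touched by the command, but the blocks in it no longer belong to any file in the new namespace.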

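Note: for a clean reformat, first stop all Hadoop services, then delete the tmp folder (the one containing the NameNode and DataNode directories) on your local filesystem, run the format command, and start the Hadoop services again. Removing the old DataNode directory avoids a mismatch between the freshly formatted NameNode and stale DataNode data.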
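The sequence in the note above could be scripted roughly as follows. This is a minimal sketch, assuming a Hadoop 1.x install where bin/stop-all.sh and bin/start-all.sh are available, and assuming the storage root is the /tmp/hadoop-<user> directory seen in the question's log. The names run, reformat_hdfs and DRY_RUN are made up for illustration. By default it only prints the commands; nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: wipe and re-create a pseudo-distributed HDFS.
# This destroys all HDFS data when run with DRY_RUN=0.
# run, reformat_hdfs and DRY_RUN are illustrative names, not part of Hadoop.

run() {
    # Print the command; execute it only when DRY_RUN is explicitly 0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

reformat_hdfs() {
    hadoop_home=$1      # assumed: the hadoop-1.2.1 install directory
    storage_root=$2     # assumed: hadoop.tmp.dir, e.g. /tmp/hadoop-myuser

    run "$hadoop_home/bin/stop-all.sh" || return 1              # 1. stop all daemons
    run rm -rf "$storage_root/dfs" || return 1                  # 2. remove old name and data dirs
    run "$hadoop_home/bin/hadoop" namenode -format || return 1  # 3. create a fresh namespace
    run "$hadoop_home/bin/start-all.sh"                         # 4. start the daemons again
}
```

Calling reformat_hdfs with the install directory and the storage root prints the four steps in order, which makes it easy to review them before setting DRY_RUN=0.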

Why hadoop namenode -format has this effect:

The NameNode is the central component of an HDFS file system: it keeps the directory tree of all files in the file system and tracks where across the cluster each file's data is kept. In short, it holds the metadata that maps files to blocks and blocks to DataNodes. Formatting the NameNode wipes that metadata. The block data may still sit on the DataNodes' disks, but without the metadata nothing can locate it, so it is effectively lost. The DataNode directories can then be cleared and reused for new data.
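One way to see this on disk, sketched under the assumption of a Hadoop 1.x storage layout: the NameNode and DataNode directories each keep a small VERSION file under current/ with a namespaceID entry, and formatting writes a new ID on the NameNode side only. The helper names namespace_id and same_namespace are made up for illustration.

```shell
# Print the namespaceID recorded in a Hadoop storage directory.
# $1: storage directory, e.g. /tmp/hadoop-myuser/dfs/name
namespace_id() {
    sed -n 's/^namespaceID=//p' "$1/current/VERSION"
}

# Succeed only if the NameNode and DataNode directories carry the same ID.
# $1: NameNode directory, $2: DataNode directory
same_namespace() {
    name_id=$(namespace_id "$1") || return 2
    data_id=$(namespace_id "$2") || return 2
    [ -n "$name_id" ] && [ "$name_id" = "$data_id" ]
}
```

For example, same_namespace /tmp/hadoop-myuser/dfs/name /tmp/hadoop-myuser/dfs/data succeeds on a consistent setup. If the IDs differ after a format, a Hadoop 1.x DataNode will typically refuse to start and report incompatible namespace IDs in its log.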

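By default the NameNode directory is ${hadoop.tmp.dir}/dfs/name. With the default hadoop.tmp.dir of /tmp/hadoop-${user.name}, that is "/tmp/hadoop-myuser/dfs/name" in your case, as your log output shows.

When you formatted the NameNode, the contents of this directory were replaced with a fresh, empty fsimage and edits log.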
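To see the difference for yourself, as the question asks, one option is to record the contents of the name directory before and after formatting and compare the two listings. A minimal sketch, assuming the directory from the log (/tmp/hadoop-myuser/dfs/name) and the standard find, sort, cksum and diff tools; snapshot_dir is a made-up helper name.

```shell
# Write one line per file under a directory: checksum, size, relative path.
# $1: directory to inspect, $2: output file for the listing
snapshot_dir() {
    ( cd "$1" && find . -type f | sort | while IFS= read -r f; do
          cksum "$f"
      done ) > "$2"
}

# Possible usage (paths are assumptions taken from the question's log):
#   snapshot_dir /tmp/hadoop-myuser/dfs/name /tmp/name-before.txt
#   bin/hadoop namenode -format
#   snapshot_dir /tmp/hadoop-myuser/dfs/name /tmp/name-after.txt
#   diff /tmp/name-before.txt /tmp/name-after.txt
```

On a Hadoop 1.x install you would expect at least current/VERSION to differ between the two listings, since formatting assigns a new namespaceID. Running bin/hadoop fs -ls / before formatting and again after restarting the services shows the same effect from the HDFS side.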

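To change the NameNode location, add the following properties to hdfs-site.xml. These are the Hadoop 2.x property names; on Hadoop 1.x, as used in the question, the equivalents are dfs.name.dir and dfs.data.dir.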

<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/search/data/dfs/namenode</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/search/data/dfs/datanode</value>
</property>

I hope this helps. :-)

answered Dec 07 '22 19:12

Suresh Ram
