When a namenode starts up, it reads the HDFS state from an image file, fsimage, and then applies the edits from the edit log file.
If I am not mistaken, "the namenode starts up" means when we run start-all.sh. So during startup I think it reads the fsimage and the edit logs and merges them. But from which folder or location does it actually read them?
They will be found on the NameNode host, in the NameNode directory, which is typically /data/dfs/nn, but you should check the location configured for your cluster. In the NameNode directory there will be a directory named current. Copies of both the fsimage_* and the fsimage_*.md5 files should be found there.
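To make the directory layout concrete, here is a minimal sketch that looks inside a NameNode current directory and picks the newest image. It assumes the files are named fsimage_<txid> with a numeric transaction id, and that the matching checksum file is the same name plus .md5; the helper names latest_fsimage and has_md5 are made up for this illustration and are not part of Hadoop.

```python
import re
from pathlib import Path

# Assumption: image files are named fsimage_<txid>, where <txid> is all digits.
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")


def latest_fsimage(current_dir):
    """Return (path, txid) for the fsimage with the highest txid, or None."""
    best = None
    for entry in Path(current_dir).iterdir():
        match = FSIMAGE_RE.match(entry.name)
        if match is None:
            continue  # not an image file (e.g. fsimage_*.md5 or an edits file)
        txid = int(match.group(1))
        if best is None or txid > best[1]:
            best = (entry, txid)
    return best


def has_md5(fsimage_path):
    """Check that the companion <fsimage>.md5 file sits next to the image."""
    return Path(str(fsimage_path) + ".md5").is_file()
```

Calling latest_fsimage("/data/dfs/nn/current") on the NameNode host (with your own path substituted) would show which image is the most recent one in that directory.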
The FsImage is stored as a file in the NameNode's local file system too. The NameNode keeps an image of the entire file system namespace and file Blockmap in memory.
FsImage is a point-in-time snapshot of HDFS's namespace. The edit log records every change made since the last snapshot. The last snapshot is what is stored in the FsImage.
When the namenode starts, the latest FsImage file is loaded into memory, and the EditLog is then read and applied on top of it if the FsImage does not contain up-to-date information. The namenode keeps the metadata in memory in order to serve multiple client requests as fast as possible.
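The idea of "load the snapshot, then apply the newer edits" can be sketched with a toy model. This is only a conceptual illustration, not how the NameNode is implemented: the namespace is reduced to a set of paths, the edit log to a list of (txid, op, path) tuples, and the operation names create and delete are invented for the example.

```python
def replay(snapshot, last_txid, edits):
    """Apply every edit newer than last_txid to a copy of the snapshot.

    snapshot  -- set of paths captured by the (toy) fsimage
    last_txid -- id of the last transaction already included in the snapshot
    edits     -- iterable of (txid, op, path) tuples from the (toy) edit log
    """
    namespace = set(snapshot)  # work on a copy, leave the snapshot untouched
    for txid, op, path in sorted(edits):
        if txid <= last_txid:
            continue  # already reflected in the snapshot
        if op == "create":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
        else:
            raise ValueError("unknown operation: " + op)
        last_txid = txid
    return namespace, last_txid


if __name__ == "__main__":
    image = {"/a", "/b"}  # snapshot taken after transaction 2
    log = [(1, "create", "/a"), (2, "create", "/b"),
           (3, "create", "/c"), (4, "delete", "/a")]
    print(replay(image, 2, log))
```

If the snapshot is already up to date (no edit has a txid above last_txid), the loop changes nothing, which matches the "only if FsImage is not up to date" condition described above.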
In hadoop-1.x, the start-all.sh script internally performs two operations: start-dfs.sh and start-mapred.sh. start-dfs.sh starts all the daemons required for HDFS, i.e. datanode, namenode and secondary namenode.
The checkpoint operation (applying the edit logs to the fsimage) happens during namenode start. How often it is repeated while the namenode is running can be configured by tuning the parameter hdfs-site.xml -> dfs.namenode.checkpoint.period.
During startup, the namenode daemon loads the fsimage from the directory specified in hdfs-site.xml -> dfs.name.dir. This property should be overridden; otherwise it takes the default value, ${hadoop.tmp.dir}/dfs/name, which lives under /tmp.
The location of the edit logs can be found by checking the value of hdfs-site.xml -> dfs.name.edits.dir. The default value of dfs.name.edits.dir is ${dfs.name.dir}.
The above property names were changed in hadoop-2.0, to dfs.namenode.name.dir and dfs.namenode.edits.dir respectively.
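As a rough sketch of how you might look these values up yourself, the snippet below parses the text of an hdfs-site.xml and returns the image and edits directories. It checks the Hadoop 2 names dfs.namenode.name.dir and dfs.namenode.edits.dir first and then the older names, and falls back to the name directory when no edits directory is set. The function names are hypothetical, and the sketch does not expand ${...} variables or apply Hadoop's built-in defaults, so an unset property simply comes back as None.

```python
import xml.etree.ElementTree as ET

# Newer (hadoop-2.x) names first, then the hadoop-1.x names.
NAME_DIR_KEYS = ("dfs.namenode.name.dir", "dfs.name.dir")
EDITS_DIR_KEYS = ("dfs.namenode.edits.dir", "dfs.name.edits.dir")


def read_properties(xml_text):
    """Collect <property><name>/<value> pairs from a Hadoop config document."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def first_set(props, keys):
    """Return the value of the first key that is present, else None."""
    for key in keys:
        if key in props:
            return props[key]
    return None


def namenode_dirs(xml_text):
    """Return (name_dir, edits_dir) as configured; edits_dir defaults to name_dir."""
    props = read_properties(xml_text)
    name_dir = first_set(props, NAME_DIR_KEYS)
    edits_dir = first_set(props, EDITS_DIR_KEYS) or name_dir
    return name_dir, edits_dir
```

For example, namenode_dirs(open("hdfs-site.xml").read()) would report where a given installation keeps its fsimage and edit logs, assuming the file is in the working directory.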