I am trying to recover NN (NameNode) metadata. I have taken a backup of the NameNode and JournalNode metadata; it contains edit logs and fsimages.
There are two NNs in my system. I back up the metadata on both NNs (HDFS metadata and QJM metadata) at regular intervals. I want to test the recovery procedure in a worst-case scenario: assume both NNs and the JournalNode are down and their metadata has been completely deleted.
I want to recover the NN metadata from the backup and start the NN. I know there could be data loss, since any changes made after the backup would be missing.
Questions:
Steps tried:
Alternative approach: restore all the edit logs and fsimages to both the HDFS and QJM directories and start the NN, but it still fails.
Both NNs are down and I can't bring them up. I don't want to format HDFS, as that would change the Cluster ID and make the backup unusable.
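Whether a restored directory still carries the original Cluster ID can be checked by reading the clusterID line in its current/VERSION file. A minimal sketch, assuming the usual storage layout with a current/VERSION file; the function names and the paths in the comment are hypothetical, made up for this example:

```shell
#!/usr/bin/env bash
# Print the clusterID recorded in a metadata directory's current/VERSION file.
# $1: a storage directory that contains current/VERSION (for a JournalNode,
#     this is assumed to be the per-nameservice subdirectory).
cluster_id() {
  grep '^clusterID=' "$1/current/VERSION" | cut -d= -f2
}

# Succeed only if both directories report the same, non-empty clusterID.
same_cluster() {
  local a b
  a="$(cluster_id "$1")"
  b="$(cluster_id "$2")"
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example (hypothetical paths):
#   same_cluster /backup/nn1/name /data/dfs/name && echo "clusterID matches"
```

If the two IDs differ, the restored metadata does not belong to the same cluster as the directory it is compared against.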
Exceptions:
Because the latest fsimage and edit logs have been lost or corrupted, you should try to recover the metadata:
./bin/hadoop namenode -recover
Refer: NameNode Recovery Tools for the Hadoop Distributed File System
Because the journal is not in sync with the NameNode, you should reinitialize it:
./bin/hdfs namenode -initializeSharedEdits
Because the recovered fsimage is missing the changes made since the last backup, you should check for corrupted files and delete them:
./bin/hadoop fsck -delete /
If you do not run fsck, the NameNode may stay stuck in safe mode because too many blocks are missing.
You can start the NameNode with the recover flag enabled. NameNode recovery will take care of the corrupt metadata.
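Since fsck -delete permanently removes the affected files, it may be worth looking before deleting. A minimal sketch, assuming the NameNode is already running; HDFS_BIN and check_before_delete are names made up for this example, while dfsadmin -safemode get and fsck -list-corruptfileblocks are standard HDFS commands:

```shell
#!/usr/bin/env bash
# HDFS_BIN is a hypothetical override so the sketch can point at a
# specific install (for example ./bin/hdfs); it defaults to hdfs on PATH.
HDFS_BIN="${HDFS_BIN:-hdfs}"

check_before_delete() {
  # Report whether the NameNode is still in safe mode.
  "$HDFS_BIN" dfsadmin -safemode get
  # List files with corrupt or missing blocks without modifying anything.
  "$HDFS_BIN" fsck / -list-corruptfileblocks
}

# Usage: run check_before_delete, review the output, then decide
# whether to run fsck with -delete.
```

Both commands only read state, so they are safe to run repeatedly while deciding which files can be given up.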
./bin/hadoop namenode -recover