Are there any suggestions about size of HDD on namenode physical machine? Sure, it does not store any data from HDFS like datanode but what should I depend on while creating cluster?
Physical disk space on the NameNode does not really matter unless you run a Datanode on the same node. However, it is very important to have good memory (RAM) space allocated to the NameNode. This is because the NameNode stores all the metadata of the HDFS (block allocations, block locations etc.), in memory. If sufficient memory is not allocated, the NameNode might run out of memory and fail.
You might need some space to actually store the the NameNode's FSImage, edit file and other relevant files.
It's actually recommended to configure NameNode to use multiple directories (one local and other NFS mount), so that multiple copies of File System metadata will be stored. That way, as long as the directories are on separate disks, a single disk failure will not corrupt the meta-data.
Please see this link for more details.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With