The page https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html says:
the DataNodes are configured with the location of both NameNodes, and send block location information and heartbeats to both.
But why is this information sent to both the NameNode and its standby counterpart? I thought this information was already contained in the NameNode's fsimage. The NameNode should know where it put the blocks.
The Heartbeats that the NameNode receives from a DataNode also carry information such as total storage capacity, the fraction of storage in use, and the number of data transfers currently in progress. The NameNode uses these statistics for its block allocation and load balancing decisions.
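To make the use of these statistics concrete, here is a toy sketch in Python (not Hadoop code). A heartbeat is modeled as a small record holding the three statistics named above, and a hypothetical `pick_targets` helper prefers DataNodes with fewer transfers in progress and a lower used fraction. The names `Heartbeat` and `pick_targets` and the ranking rule are assumptions made for illustration; the real placement policy in HDFS weighs more factors, such as rack location.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Toy model of the statistics a DataNode reports (field names are illustrative)."""
    datanode: str
    capacity_bytes: int      # total storage capacity
    used_bytes: int          # storage currently in use
    active_transfers: int    # data transfers currently in progress

    @property
    def used_fraction(self) -> float:
        return self.used_bytes / self.capacity_bytes


def pick_targets(heartbeats, replicas):
    """Return the names of the `replicas` least loaded DataNodes.

    Hypothetical ranking: fewest active transfers first, then lowest used fraction.
    """
    ranked = sorted(heartbeats, key=lambda hb: (hb.active_transfers, hb.used_fraction))
    return [hb.datanode for hb in ranked[:replicas]]


latest = [
    Heartbeat("dn1", capacity_bytes=1000, used_bytes=900, active_transfers=0),
    Heartbeat("dn2", capacity_bytes=1000, used_bytes=200, active_transfers=0),
    Heartbeat("dn3", capacity_bytes=1000, used_bytes=100, active_transfers=5),
]
targets = pick_targets(latest, replicas=2)
print(targets)
```

In this sketch dn3 has the most free space but is skipped because it is busy with transfers, which is the kind of decision the heartbeat statistics make possible.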
All communication between Namenode and Datanode is initiated by the Datanode, and responded to by the Namenode. The Namenode never initiates communication to the Datanode, although Namenode responses may include commands to the Datanode that cause it to send further communications.
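The request/response direction described above can be sketched as a small Python model (an illustration, not Hadoop's RPC code). The NameNode object only queues commands and hands them back when a DataNode calls in with a heartbeat. The class names, the `schedule` method, and the command string `"invalidate blk_1"` are all hypothetical.

```python
from collections import defaultdict, deque


class ToyNameNode:
    """Never calls a DataNode; it only answers heartbeats (illustrative model)."""

    def __init__(self):
        # Commands waiting for each DataNode's next heartbeat.
        self.pending = defaultdict(deque)

    def schedule(self, datanode, command):
        # The NameNode cannot push this; it waits until the DataNode calls in.
        self.pending[datanode].append(command)

    def handle_heartbeat(self, datanode):
        # The reply to a heartbeat carries any queued commands.
        commands = list(self.pending[datanode])
        self.pending[datanode].clear()
        return commands


class ToyDataNode:
    def __init__(self, name, namenode):
        self.name = name
        self.namenode = namenode
        self.executed = []

    def send_heartbeat(self):
        # The DataNode initiates the exchange and acts on the reply.
        for command in self.namenode.handle_heartbeat(self.name):
            self.executed.append(command)


nn = ToyNameNode()
dn = ToyDataNode("dn1", nn)
nn.schedule("dn1", "invalidate blk_1")   # queued, nothing is sent yet
before = list(dn.executed)
dn.send_heartbeat()                      # the command arrives in the reply
after = list(dn.executed)
```

Between `schedule` and the next `send_heartbeat` the DataNode has seen nothing, so in this model a command can be delayed by up to one heartbeat interval.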
The Namenode periodically receives a heartbeat and a block report from each Datanode in the cluster. By default, every Datanode sends a heartbeat message to the Namenode every 3 seconds.
Each DataNode sends information to the NameNode about the files and blocks stored on that node and responds to the NameNode for all filesystem operations.
The Name Node holds the metadata of the entire cluster: the details of each folder, file, replication factor, block names, etc. The Name Node also keeps in memory the location of the blocks for each file; this information is constructed from the Block Reports sent by the Data Nodes.
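This is the point that answers the original question, so a small Python sketch may help (a toy model, not Hadoop code). The persisted namespace maps files to block IDs, while the block-to-DataNode map lives only in memory and is filled in from block reports. A standby that receives the same reports ends up with the same map, and a NameNode that never received them knows the file but not where its blocks are. `ToyNameNode`, `process_block_report`, `locate`, and the block IDs are hypothetical names.

```python
from collections import defaultdict


class ToyNameNode:
    def __init__(self, namespace):
        # Persisted metadata: file -> ordered block IDs (no locations).
        self.namespace = dict(namespace)
        # In-memory only: block ID -> set of DataNodes holding a replica.
        self.locations = defaultdict(set)

    def process_block_report(self, datanode, block_ids):
        # Replace whatever was known about this DataNode with its latest report.
        for holders in self.locations.values():
            holders.discard(datanode)
        for block_id in block_ids:
            self.locations[block_id].add(datanode)

    def locate(self, path):
        # For each block of the file, list the DataNodes known to hold it.
        return [sorted(self.locations[b]) for b in self.namespace[path]]


namespace = {"/logs/a.txt": ["blk_1", "blk_2"]}
active = ToyNameNode(namespace)
standby = ToyNameNode(namespace)

reports = {"dn1": ["blk_1", "blk_2"], "dn2": ["blk_1"]}
for datanode, blocks in reports.items():
    # Each DataNode reports to both NameNodes.
    for namenode in (active, standby):
        namenode.process_block_report(datanode, blocks)

# A NameNode that was never reported to knows the file but not where it lives.
uninformed = ToyNameNode(namespace)

print(active.locate("/logs/a.txt"))
print(standby.locate("/logs/a.txt"))
print(uninformed.locate("/logs/a.txt"))
```

Under these assumptions, the standby can answer location queries immediately after a failover because it already holds the same map, whereas the uninformed instance would first have to wait for block reports.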
Data Nodes store following information for each block:
They periodically send heartbeats and block reports to the Name Node.
Heart Beat:
Interval: dfs.heartbeat.interval (in hdfs-site.xml). By default this is set to 3 seconds.
Commands the Name Node may return in its reply: BlockRecoveryCommand (to recover specified blocks), BlockCommand (for transferring blocks to another Data Node, for invalidating certain blocks), Cache/Uncache (commands for caching / uncaching the blocks).
Block Reports:
Interval: dfs.blockreport.intervalMsec (in hdfs-site.xml). By default this is set to 21600000 milliseconds.
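The two defaults use different units (seconds versus milliseconds), so a quick bit of arithmetic in Python puts them side by side. The constant names are made up for this example; only the two default values come from the text above.

```python
HEARTBEAT_INTERVAL_S = 3                 # dfs.heartbeat.interval default (seconds)
BLOCKREPORT_INTERVAL_MS = 21_600_000     # dfs.blockreport.intervalMsec default (milliseconds)

blockreport_interval_s = BLOCKREPORT_INTERVAL_MS // 1000
blockreport_interval_h = blockreport_interval_s / 3600
heartbeats_per_report = blockreport_interval_s // HEARTBEAT_INTERVAL_S

print(blockreport_interval_h)   # 6.0
print(heartbeats_per_report)    # 7200
```

So with the defaults a full block report is sent every 6 hours, roughly once per 7200 heartbeats.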