
What does Hadoop do after one of the DataNodes goes down?

Tags:

hadoop

I have a Hadoop cluster with 10 DataNodes and 2 NameNodes, and the replication factor is set to 3. If one of the DataNodes goes down, will Hadoop try to re-create the lost replicas on the remaining live nodes, or will it do nothing (since 2 replicas are still left)?

Also, what if the failed DataNode comes back after a while? Can Hadoop recognize the data on that node? Thanks!

Jack asked Dec 25 '15 02:12




1 Answer

Will Hadoop try to re-create the lost replicas on the remaining live nodes, or will it do nothing (since 2 replicas are still left)?

Yes, Hadoop will detect this and copy the affected data to other nodes. When the NameNode stops receiving heartbeats from a DataNode, it marks that DataNode as dead. Every block that had a replica on that node is now under-replicated, so the NameNode schedules new copies on other DataNodes, made from the surviving replicas, until each block is back at the configured replication factor.
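To make the re-replication step concrete, here is a minimal toy model in Python. It is not HDFS code: the function `re_replicate`, the node names `dn1` to `dn10`, the block names, and the random target selection are all assumptions made for illustration (real HDFS picks targets with its own rack-aware placement policy).

```python
import random

REPLICATION_FACTOR = 3  # matches the replication factor in the question


def re_replicate(block_locations, live_nodes, replication=REPLICATION_FACTOR, rng=None):
    """Drop replicas on dead nodes, then top each block back up to `replication`.

    Toy model only: targets are picked at random among live nodes that do
    not already hold the block.
    """
    rng = rng or random.Random(0)
    live = set(live_nodes)
    result = {}
    for block, nodes in block_locations.items():
        surviving = set(nodes) & live
        candidates = sorted(live - surviving)
        missing = max(0, replication - len(surviving))
        # A block with no surviving replica has no source to copy from.
        if surviving and missing:
            surviving |= set(rng.sample(candidates, min(missing, len(candidates))))
        result[block] = surviving
    return result


nodes = [f"dn{i}" for i in range(1, 11)]  # 10 DataNodes, as in the question
blocks = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn3", "dn4", "dn5"},
    "blk_3": {"dn6", "dn7", "dn8"},
}
live = [n for n in nodes if n != "dn3"]  # dn3 stops sending heartbeats
after = re_replicate(blocks, live)
for block, holders in sorted(after.items()):
    print(block, sorted(holders))
```

In this sketch `blk_1` and `blk_2` each gain one new replica on a live node, while `blk_3`, which never had a replica on `dn3`, is left untouched.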

Also, what if the failed DataNode comes back after a while? Can Hadoop recognize the data on that node?

Yes. When a DataNode comes back with all its data, it reports its blocks to the NameNode, which then sees that some blocks have more replicas than the replication factor requires. In its reply to a later heartbeat, the NameNode sends instructions to delete the extra replicas, which frees up the disk space.
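The bookkeeping for a returning node can be sketched the same way. This is again a toy model rather than HDFS code: `handle_block_report` and the block and node names are hypothetical, and the rule "delete from the returning node first" is an assumption chosen for simplicity. Real HDFS decides which excess replica to remove using its own policy.

```python
REPLICATION_FACTOR = 3


def handle_block_report(block_locations, node, reported_blocks,
                        replication=REPLICATION_FACTOR):
    """Register a returning node's replicas and pick excess replicas to delete.

    Returns a dict mapping node -> set of blocks that node would be told
    to delete. `block_locations` is updated in place.
    """
    deletions = {}
    for block in reported_blocks:
        holders = block_locations.setdefault(block, set())
        holders.add(node)
        # Toy policy (an assumption): remove the returning node's copy first.
        while len(holders) > replication:
            victim = node if node in holders else sorted(holders)[-1]
            holders.remove(victim)
            deletions.setdefault(victim, set()).add(block)
    return deletions


locations = {
    "blk_1": {"dn1", "dn2", "dn9"},  # already re-replicated to dn9 while dn3 was down
    "blk_2": {"dn4", "dn5"},         # still one replica short
}
to_delete = handle_block_report(locations, "dn3", ["blk_1", "blk_2"])
print(to_delete)
print(locations)
```

Here the copy of `blk_1` on `dn3` is surplus and gets scheduled for deletion, while the copy of `blk_2` on `dn3` is kept because it brings that block back up to three replicas.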

Snippet from the Apache HDFS documentation:

Each DataNode sends a Heartbeat message to the NameNode periodically. A network partition can cause a subset of DataNodes to lose connectivity with the NameNode. The NameNode detects this condition by the absence of a Heartbeat message. The NameNode marks DataNodes without recent Heartbeats as dead and does not forward any new IO requests to them. Any data that was registered to a dead DataNode is not available to HDFS any more. DataNode death may cause the replication factor of some blocks to fall below their specified value. The NameNode constantly tracks which blocks need to be replicated and initiates replication whenever necessary. The necessity for re-replication may arise due to many reasons: a DataNode may become unavailable, a replica may become corrupted, a hard disk on a DataNode may fail, or the replication factor of a file may be increased.
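How long "without recent Heartbeats" lasts depends on two settings. As a rough sketch, assuming the stock values of `dfs.heartbeat.interval` (3 seconds) and `dfs.namenode.heartbeat.recheck-interval` (300000 ms) and that neither is overridden in the cluster's `hdfs-site.xml`, the NameNode waits twice the recheck interval plus ten heartbeat intervals before declaring a DataNode dead. The helper name below is hypothetical.

```python
# Assumed defaults; check hdfs-site.xml for overrides on a real cluster.
HEARTBEAT_INTERVAL_S = 3        # dfs.heartbeat.interval, in seconds
RECHECK_INTERVAL_MS = 300_000   # dfs.namenode.heartbeat.recheck-interval, in milliseconds


def dead_node_timeout_seconds(heartbeat_interval_s=HEARTBEAT_INTERVAL_S,
                              recheck_interval_ms=RECHECK_INTERVAL_MS):
    """Seconds of missed heartbeats before a DataNode is marked dead."""
    return 2 * (recheck_interval_ms / 1000) + 10 * heartbeat_interval_s


timeout = dead_node_timeout_seconds()
print(f"{timeout:.0f} s (~{timeout / 60:.1f} min)")  # 630 s (~10.5 min)
```

So with those defaults, re-replication for a failed node starts only after roughly ten and a half minutes of silence, which means a DataNode that is restarted quickly may never trigger it.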

YoungHobbit answered Sep 23 '22 18:09