Does HDFS have writing consistency just like Cassandra, let's say that when I finished writing one file into HDFS, when would I get successful response, is it when the first replication completed or 3 replications completed? (suppose the rep=3)
It works differently in Hadoop
compared to Cassandra
.
You have two parameters related to replication.
dfs.replication
: Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time
dfs.namenode.replication.min
: Minimal block replication.
Once dfs.namenode.replication.min
has been met, write operation will be treated as successful.
But this replication up to dfs.replication
will happen in sequential pipeline. First Datanode writes the block and forward it to second Datanode. Second Datanode writes the block and forward it to third Datanode.
DFSOutputStream
also maintains an internal queue of packets that are waiting to be acknowledged by datanodes, called the ack queue.
A packet is removed from the ack queue only when it has been acknowledged by all the Datanodes in the pipeline.
You can find more details in related SE question:
Hadoop 2.0 data write operation acknowledgement
In a normal process, the client will wait until all the replicas sent the "acknowledge" for all the data packages. But if at the write process a DataNode fail, the client will continue writing the data to the remaining DataNodes and it will success if the number of DataNodes that acknowledge the reception of all the packages is equal or great that the minimum number of replicas (default 1). In this case since the number of replicas is less than the required number of replicas, the blocks will be marked as unreplicated and the NameNode will replicate them asynchronously.
Then, the write process could return a success even not all the required replicas were created.
If you want to force to only success if all replicas were created, you can set the property dfs.namenode.replication.min (default 1) equal than dfs.replication (default 3)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With