Is there any way to change the replication factor of a directory in Hadoop so that the change also applies to files written to that directory in the future?
You can change the replication factor of a file using command:
hdfs dfs -setrep -w 3 /user/hdfs/file.txt
You can also change the replication factor of a directory using command:
hdfs dfs -setrep -R 2 /user/hdfs/test
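To confirm what replication factor a file actually ended up with, you can query it directly. A quick sketch (the file path is a placeholder):

```shell
# Print only the replication factor of a file (%r is the replication format specifier)
hdfs dfs -stat %r /user/hdfs/test/file.txt

# Alternatively, the replication factor appears in the second column of a listing
hdfs dfs -ls /user/hdfs/test
```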
But changing the replication factor of a directory only affects the files that already exist; new files created under that directory will still get the cluster's default replication factor (dfs.replication from hdfs-site.xml).
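Because -setrep does not carry over to files written later, one way to control the replication of a new file is to set it at write time with the generic -D option. A minimal sketch, assuming a local file local.txt and a target directory /user/hdfs/test:

```shell
# Copy a local file into HDFS with an explicit replication factor of 2,
# overriding the cluster default (dfs.replication) for this one write
hdfs dfs -D dfs.replication=2 -put local.txt /user/hdfs/test/
```

Files copied without the -D override will still use the cluster default.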
See the HDFS documentation for more details on configuring the replication factor.
However, you can temporarily override the default replication factor by passing:
-D dfs.replication=1
This works well when you pass it with a MapReduce job; the setting then applies only to the files written by that job.
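For example, with the bundled MapReduce examples jar (the jar name and paths are illustrative), the generic -D option goes after the program name and affects only that job's output files:

```shell
# Run wordcount with its output files written at replication factor 1
hadoop jar hadoop-mapreduce-examples.jar wordcount \
    -D dfs.replication=1 \
    /user/hdfs/input /user/hdfs/output
```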