The Hadoop book says that we can specify a per-file block size at the time the file is created:
"The most natural way to increase the split size is to have larger blocks in HDFS, by setting dfs.block.size, or on a per-file basis at file construction time."
Any idea how to do this at file construction time? I am hoping that by setting the block size equal to the file size, the file will not be split.
You can use the CLI:

hadoop fs -D dfs.block.size=<file-size-in-bytes> -put local_name remote_location

(On Hadoop 2.x and later the property is named dfs.blocksize; dfs.block.size still works but is deprecated. The value is in bytes and must be a multiple of the checksum chunk size, 512 bytes by default.)
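For example, a minimal invocation with a concrete value (the file name and destination here are illustrative; 1073741824 bytes = 1 GiB, which is a multiple of 512 and large enough to hold the whole file in a single block):

hadoop fs -D dfs.blocksize=1073741824 -put big_input.csv /data/big_input.csv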
Or you can use the Java API to specify dfs.block.size when you want to create or copy files:

import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
conf.setLong("dfs.block.size", fileSize); // desired block size in bytes

(Use setLong rather than setInt, since block sizes can exceed Integer.MAX_VALUE.)
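If you are creating the file yourself, FileSystem.create also has an overload that takes the block size directly, so you do not need to change the configuration at all. A minimal sketch, assuming a hypothetical output path and the usual defaults for buffer size and replication:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateWithBlockSize {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        long blockSize = 512L * 1024 * 1024;                  // 512 MB; must be a multiple of 512 bytes
        Path out = new Path("/data/single_block_file.txt");   // hypothetical output path

        // create(path, overwrite, bufferSize, replication, blockSize)
        FSDataOutputStream stream = fs.create(out, true, 4096, (short) 3, blockSize);
        try {
            stream.writeUTF("payload goes here");             // placeholder content
        } finally {
            stream.close();
        }
    }
}

Every block of this file will use the given block size, so picking a value at least as large as the final file size keeps it in one block (and therefore one input split, with the default split settings).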