 

Hadoop per-file block size

In the Hadoop book it is said that we can specify a per-file block size at the time a file is created:

"The most natural way to increase the split size is to have larger blocks in HDFS, by setting dfs.block.size, or on a per-file basis at file construction time."

Any idea how to do this at file construction time? I hope that by setting this value to the file size, the file will not be split.

asked Feb 07 '12 by sunillp

1 Answer

You can use the CLI:

hadoop fs -D dfs.block.size=file-size -put local_name remote_location
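For example, to upload a file smaller than 1 GB as a single block, you could pass the block size in bytes (the figure below is illustrative; it must be at least the file size):

hadoop fs -D dfs.block.size=1073741824 -put local_name remote_location

On Hadoop 2.x and later the property is spelled dfs.blocksize, and it also accepts suffixes such as 128m or 1g.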

Or you can use the Java API to set dfs.block.size when you create or copy files:

import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
conf.setLong("dfs.block.size", fileSize); // fileSize in bytes; setLong, since block sizes can exceed Integer.MAX_VALUE
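For the "at file construction time" part of the question, the FileSystem.create() overload that takes an explicit block size does this directly. A minimal sketch, assuming a 512 MB block size and a made-up path:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateWithBlockSize {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        long blockSize = 512L * 1024 * 1024;                       // e.g. 512 MB, in bytes
        short replication = fs.getDefaultReplication();
        int bufferSize = conf.getInt("io.file.buffer.size", 4096);

        // This create() overload takes a per-file block size,
        // overriding the cluster default for this one file only.
        FSDataOutputStream out = fs.create(
                new Path("/user/sunillp/data.bin"),                // hypothetical path
                true,                                              // overwrite if it exists
                bufferSize,
                replication,
                blockSize);
        out.writeBytes("...");                                     // write the file contents
        out.close();
    }
}

Note that HDFS requires the block size to be a multiple of io.bytes.per.checksum (512 bytes by default), so round up to a multiple of 512 when sizing the block to match the file.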
answered Oct 05 '22 by owen wang