My map is currently inefficient when parsing one particular set of files (2 TB in total). I'd like to change the block size of these files in the Hadoop DFS from 64 MB to 128 MB. I can't find in the documentation how to do this for just one set of files rather than the entire cluster.
Which command changes the block size when I upload (for example, when copying from local to DFS)?
You can raise the HDFS block size from the default of 64 MB to 128 MB in order to optimize performance for most use cases. Boosting the block size allows EMC Isilon cluster nodes to read and write HDFS data in larger blocks.
When the block size is small, seek overhead increases: the same data is split across more blocks, and every additional block means another seek when reading or writing.
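To put rough numbers on this, here is a minimal sketch of the block-count arithmetic for the 2 TB data set from the question. It assumes binary units (1 MB = 1024 * 1024 bytes) and treats the whole data set as one contiguous file, so the real count will be somewhat higher because each file ends in its own partial block. The class and method names are made up for illustration.

```java
public class BlockCount {
    static final long MB = 1024L * 1024L;
    static final long TB = 1024L * 1024L * MB;

    // Number of blocks needed to hold fileBytes, rounding up for a partial last block.
    static long blocksNeeded(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long dataset = 2 * TB; // assumption: the 2 TB from the question, as one file

        // 128 MB expressed in bytes, the unit the block size setting expects
        System.out.println("128 MB in bytes:  " + 128 * MB);
        System.out.println("blocks at 64 MB:  " + blocksNeeded(dataset, 64 * MB));
        System.out.println("blocks at 128 MB: " + blocksNeeded(dataset, 128 * MB));
    }
}
```

Under these assumptions, doubling the block size halves the number of blocks (32768 down to 16384), and 128 MB works out to 134217728 bytes.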
For me, I had to slightly change Bkkbrad's answer to get it to work with my setup, in case anyone else finds this question later on. I've got Hadoop 0.20 running on Ubuntu 10.10:
hadoop fs -D dfs.block.size=134217728 -put local_name remote_location
The setting for me is not fs.local.block.size but rather dfs.block.size.
I change my answer! You just need to set the fs.local.block.size configuration setting appropriately when you use the command line.
hadoop fs -D fs.local.block.size=134217728 -put local_name remote_location
Original Answer
You can programmatically specify the block size when you create a file with the Hadoop API. Unfortunately, you can't do this on the command line with the hadoop fs -put command. To do what you want, you'll have to write your own code to copy the local file to a remote location. It's not hard: just open a FileInputStream for the local file, create the remote OutputStream with FileSystem.create, and then use something like IOUtils.copy from Apache Commons IO to copy between the two streams.
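You can also set the block size in your programs like this:
import org.apache.hadoop.conf.Configuration;
Configuration conf = new Configuration();
// The value is in bytes; setLong is used because Configuration.set only accepts String values.
conf.setLong("dfs.block.size", 128L * 1024 * 1024);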