Why Is a Block in HDFS So Large?

Can somebody walk through this calculation and explain it lucidly?

A quick calculation shows that if the seek time is around 10 ms and the transfer rate is 100 MB/s, to make the seek time 1% of the transfer time, we need to make the block size around 100 MB. The default is actually 64 MB, although many HDFS installations use 128 MB blocks. This figure will continue to be revised upward as transfer speeds grow with new generations of disk drives.
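As a quick sanity check on that figure, here is a minimal sketch of the back-of-the-envelope calculation, using only the numbers quoted above (10 ms seek, 100 MB/s transfer, seek time targeted at 1% of transfer time):

```python
# Sketch of the quoted calculation: pick a block size so that
# seek time is ~1% of transfer time.
seek_time = 0.010        # seconds (10 ms)
transfer_rate = 100e6    # bytes per second (100 MB/s)
target_ratio = 0.01      # seek time should be ~1% of transfer time

# seek_time / transfer_time = target_ratio  =>  transfer_time = seek_time / target_ratio
transfer_time = seek_time / target_ratio          # 1 second
block_size = transfer_time * transfer_rate        # 100 MB
print(f"block size ~ {block_size / 1e6:.0f} MB")  # -> 100 MB
```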

Kumar asked Mar 12 '14

People also ask

What is the default block size in HDFS and why is it so large?

The default size of a block in HDFS is 128 MB (Hadoop 2.x) and 64 MB (Hadoop 1.x), which is much larger than the typical Linux filesystem block size of 4 KB. The reason for such a large block size is to minimize the cost of seeks and to reduce the amount of metadata generated per block.

Why is the block size 64 MB in Hadoop?

The reason Hadoop chose 64 MB was that Google chose 64 MB, and Google's choice came down to a Goldilocks argument: a much smaller block size would increase seek overhead, while a much larger one would limit parallelism.

Why did the Hadoop block size increase?

The reasons for increasing the block size from 64 MB to 128 MB are: to improve NameNode performance (fewer blocks means less metadata to track), and to improve MapReduce job performance, since the number of mappers depends directly on the block size.
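As a rough sketch of that mapper relationship, assuming the common case where one mapper is launched per HDFS block (i.e. input split size equals block size; the 1 TiB input size is just an illustrative figure):

```python
# Sketch: count blocks/mappers for a 1 TiB input at 64 MB vs 128 MB blocks,
# assuming one mapper per block.
import math

file_size = 1 * 1024**4              # 1 TiB of input data, in bytes

for block_size_mb in (64, 128):
    block_size = block_size_mb * 1024**2
    mappers = math.ceil(file_size / block_size)   # one block per split here
    print(f"{block_size_mb} MB blocks -> {mappers} blocks / ~{mappers} mappers")

# 64 MB -> 16384 mappers, 128 MB -> 8192 mappers: larger blocks mean fewer
# mappers to schedule and fewer block records for the NameNode to hold in memory.
```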

Why is the data block size set to 128 MB in Hadoop?

At the same time, the block size should not be so large that parallelism can't be achieved, with the system waiting a very long time for one unit of data processing to finish its work. A balance needs to be struck, which is why the default block size is 128 MB.


2 Answers

A block will be stored as a contiguous piece of information on the disk, which means that the total time to read it completely is the time to locate it (seek time) + the time to read its content without doing any more seeks, i.e. sizeOfTheBlock / transferRate = transferTime.

If we keep the ratio seekTime / transferTime small (close to .01 in the text), it means we are reading data from the disk almost as fast as the physical limit imposed by the disk, with minimal time spent looking for information.
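A minimal sketch of that ratio, using the numbers from the question (10 ms seek, 100 MB/s transfer); the block sizes listed are just illustrative:

```python
# What fraction of the total read time is spent seeking, per block size?
SEEK_TIME = 0.010          # seconds per seek
TRANSFER_RATE = 100e6      # bytes per second (100 MB/s)

for label, block_size in [("4 KB", 4 * 1024), ("64 MB", 64e6), ("128 MB", 128e6)]:
    transfer_time = block_size / TRANSFER_RATE
    seek_fraction = SEEK_TIME / (SEEK_TIME + transfer_time)
    print(f"{label:>6}: transfer {transfer_time * 1000:8.2f} ms, "
          f"seek is {seek_fraction:6.1%} of total")

# 4 KB  : seek dominates (~99.6% of the time is spent seeking)
# 64 MB : seek is ~1.5% of the total; 128 MB: ~0.8%
```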

This matters because MapReduce jobs typically traverse (read) the whole data set (an HDFS file, folder, or set of folders) and apply logic to it. Since we have to spend the full transferTime anyway to get all the data off the disk, we try to minimise the time spent on seeks and read in big chunks, hence the large size of the data blocks.

In more traditional disk-access software we typically do not read the whole data set every time, so we would rather pay for plenty of seeks on smaller blocks than waste time transferring data we will never need.

Svend answered Sep 28 '22

Suppose the same 100 MB were divided into ten 10 MB blocks instead of one 100 MB block. You would have to do 10 seeks, and transferring each 10 MB block at 100 MB/s takes 10/100 = 0.1 s. The total time is (10 ms × 10) + (0.1 s × 10) = 1.1 s, compared with 10 ms + 1 s = 1.01 s for a single 100 MB block.
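The same comparison as a tiny sketch (the helper function is just for illustration, using the 10 ms seek and 100 MB/s transfer figures from the question):

```python
# Total time to read `total_mb` of data stored in `block_mb`-sized blocks:
# one seek per block plus the raw transfer time.
def read_time(total_mb, block_mb, seek_s=0.010, rate_mb_s=100):
    blocks = total_mb / block_mb
    return blocks * seek_s + total_mb / rate_mb_s

print(read_time(100, 10))    # ten 10 MB blocks  -> 1.10 s
print(read_time(100, 100))   # one 100 MB block  -> 1.01 s
```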

Shanker Lolakapuri answered Sep 28 '22