 

Hadoop HDFS maximum file size

Tags:

hadoop

hdfs

A colleague of mine thinks that HDFS has no maximum file size, i.e., by partitioning into 128 MB / 256 MB chunks, any file size can be stored (obviously the HDFS disks have a finite size and that will be a limit, but is that the only limit?). I can't find anything saying that there is a limit, so is she correct?

thanks, jim

asked Mar 31 '11 by jimmyb


People also ask

How do I increase file size in HDFS?

There is no setting for a maximum file size; the only related knob is the block size (each file is split into multiple blocks). HDFS itself places no limit on how large a file can be.
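If larger blocks are wanted, the block size can be set cluster-wide (dfs.blocksize in Hadoop 2+) or per file at creation time. A minimal Java sketch using the standard FileSystem API, assuming a reachable cluster; the path, replication factor, buffer size, and 256 MB block size are illustrative values only:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Cluster-wide default for new files (Hadoop 2+ property name):
            // conf.setLong("dfs.blocksize", 256L * 1024 * 1024);

            FileSystem fs = FileSystem.get(conf);

            // Per-file override: the last argument is the block size in bytes.
            try (FSDataOutputStream out = fs.create(
                    new Path("/tmp/example.dat"),   // hypothetical path
                    true,                           // overwrite if it exists
                    4096,                           // I/O buffer size
                    (short) 3,                      // replication factor
                    256L * 1024 * 1024)) {          // 256 MB block size
                out.writeUTF("hello hdfs");
            }
        }
    }

Note that the block size only applies to files written after the change; existing files keep the block size they were created with.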

How will a file of 100mb be stored in Hadoop?

As a rule of thumb, each block consumes roughly 150 bytes of NameNode metadata. Storing 100 MB as 100 separate 1 MB files therefore needs 100 blocks, i.e. about 150 x 100 = 15,000 bytes of NameNode RAM. Consider instead a single file "IdealFile" of size 100 MB: we need only one block here, B1, which is replicated to Machine 1, Machine 2 and Machine 3, and it occupies only about 150 bytes of NameNode RAM.

What is file size in HDFS?

Files in HDFS are broken into block-sized chunks called data blocks. These blocks are stored as independent units. The size of these HDFS data blocks is 128 MB by default in Hadoop 2.x and later.

What is the default HDFS block size: 32 MB, 64 KB, 128 KB, or 64 MB?

64 MB. The size of the data block in HDFS is 64 MB by default in Hadoop 1.x (128 MB in Hadoop 2.x and later), and it can be configured manually.
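A minimal sketch, assuming a reachable cluster and an hdfs-site.xml on the classpath, that reads back whatever default block size the cluster is actually configured with:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DefaultBlockSize {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration(); // picks up hdfs-site.xml from the classpath
            FileSystem fs = FileSystem.get(conf);

            // Default block size used for new files under this path
            // (controlled by dfs.blocksize in Hadoop 2+).
            long blockSize = fs.getDefaultBlockSize(new Path("/"));
            System.out.println("default block size = " + (blockSize / (1024 * 1024)) + " MB");
        }
    }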


2 Answers

Well, there is obviously a practical limit. But physically, HDFS block IDs are Java longs, so they have a maximum of 2^63; if your block size is 64 MB, the theoretical maximum file size is 2^63 x 64 MB = 512 yottabytes.

answered Nov 09 '22 by Rich Garris
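The arithmetic in that answer works out: 2^63 blocks x 64 MB per block = 2^89 bytes, and 2^89 / 2^80 = 512, i.e. 512 (binary) yottabytes. A short Java sketch that reproduces the calculation:

    import java.math.BigInteger;

    public class MaxFileSizeCheck {
        public static void main(String[] args) {
            // Block IDs are Java longs, so take 2^63 as the theoretical block-count ceiling.
            BigInteger maxBlocks = BigInteger.valueOf(2).pow(63);
            BigInteger blockSize = BigInteger.valueOf(64L * 1024 * 1024); // 64 MB
            BigInteger maxBytes  = maxBlocks.multiply(blockSize);         // 2^89 bytes
            BigInteger yottabyte = BigInteger.valueOf(2).pow(80);         // 1 YB (binary)
            System.out.println(maxBytes.divide(yottabyte) + " yottabytes"); // prints 512
        }
    }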


I think she's right that there is no maximum file size in HDFS. The only thing you can really set is the block (chunk) size, which is 64 MB by default. Files of any length can be stored; the only constraint is that the bigger the file, the more hardware is needed to accommodate it.

answered Nov 09 '22 by Vinayak Ponangi
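To see concretely that a bigger file just means more blocks spread across more machines, here is a minimal sketch using the standard FileSystem API; the path is hypothetical and a reachable cluster is assumed:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class InspectBlocks {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path p = new Path("/data/bigfile.dat"); // hypothetical path

            FileStatus status = fs.getFileStatus(p);
            System.out.printf("length=%d bytes, blockSize=%d bytes%n",
                    status.getLen(), status.getBlockSize());

            // Each block is an independent unit replicated across datanodes, so a larger
            // file simply shows up here as more blocks on more hosts.
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation b : blocks) {
                System.out.printf("offset=%d len=%d hosts=%s%n",
                        b.getOffset(), b.getLength(), String.join(",", b.getHosts()));
            }
        }
    }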