data block size in HDFS, why 64MB?

1 Answer

What does 64MB block size mean?

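The block size is the smallest unit of data that the file system stores. Whether you store a file that's 1 KB or 60 MB, it'll take up one block. Once you cross the 64 MB boundary, you need a second block. Note that in HDFS a file smaller than the block size only consumes as much disk space as its actual contents, not a full 64 MB.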
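To make the boundary rule concrete, here is a minimal sketch of the arithmetic. The name `blocks_needed` is made up for illustration and is not part of any Hadoop API; it only assumes that a file is cut into fixed-size blocks with a possibly shorter last block.

```python
MB = 1024 * 1024
BLOCK_SIZE = 64 * MB  # the block size discussed in this answer


def blocks_needed(file_size_bytes, block_size=BLOCK_SIZE):
    """Return how many blocks a file of this size is split into.

    Hypothetical helper: plain ceiling division, no Hadoop calls involved.
    """
    return (file_size_bytes + block_size - 1) // block_size


print(blocks_needed(1024))         # a 1 KB file
print(blocks_needed(60 * MB))      # a 60 MB file
print(blocks_needed(64 * MB))      # exactly on the boundary
print(blocks_needed(64 * MB + 1))  # one byte past the boundary
```

The first three calls stay within a single block; only the last one spills into a second.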

If yes, what is the advantage of doing that?

HDFS is meant to handle large files. Let's say you have a 1000 MB file. With a 4 KB block size, you'd have to make 256,000 requests to get that file (one request per block). In HDFS, those requests go across a network and come with a lot of overhead. Each request has to be processed by the NameNode to determine where that block can be found. That's a lot of traffic! If you use 64 MB blocks, the number of requests goes down to 16, significantly reducing the overhead and the load on the NameNode.
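The numbers above can be checked with a few lines of arithmetic. This is only a sketch under the answer's own simplifying assumption of one request per block; `block_requests` is a hypothetical helper, not a Hadoop function.

```python
KB = 1024
MB = 1024 * KB


def block_requests(file_size, block_size):
    """Requests needed to read a file, assuming one request per block."""
    return -(-file_size // block_size)  # ceiling division on integers


file_size = 1000 * MB

small_blocks = block_requests(file_size, 4 * KB)   # 4 KB block size
large_blocks = block_requests(file_size, 64 * MB)  # 64 MB block size

print(small_blocks)                  # 256000
print(large_blocks)                  # 16
print(small_blocks // large_blocks)  # 16000 times fewer requests
```

The last block of the 64 MB layout is only partly filled (1000 / 64 = 15.625), which is why the count rounds up to 16.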

149

answered Oct 02 '22 09:10

bstempi


Tags:

database

hadoop

block

mapreduce

hdfs

dykw
