Does a block in the Hadoop Distributed File System store multiple small files, or does a block store only one file?
If you're storing small files, then you probably have lots of them (otherwise you wouldn't turn to Hadoop), and the problem is that HDFS can't handle lots of files efficiently. Every file, directory, and block in HDFS is represented as an object in the namenode's memory, each of which occupies about 150 bytes as a rule of thumb.
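As a rough illustration of that rule of thumb (the figures below are illustrative assumptions, not exact namenode accounting): 10,000,000 small files that each fit in a single block cost roughly 10,000,000 x (1 file object + 1 block object) x 150 bytes, or about 3 GB of namenode heap, whereas the same amount of data packed into a few thousand large files made of 128 MB blocks would need only on the order of tens of megabytes of namenode memory for its metadata.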
You can write a MapReduce program that packs lots of small files into a single SequenceFile. SequenceFiles are splittable, so MapReduce can break them into chunks and operate on each chunk independently. They also support block compression, which is usually the best option.
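As a minimal sketch of the packing step (a plain driver rather than a full MapReduce job; the class name, paths, and the choice of file name as key and raw bytes as value are assumptions for illustration), something along these lines writes each small file as one record of a block-compressed SequenceFile:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.DefaultCodec;

    public class PackSmallFiles {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path inputDir = new Path(args[0]);   // directory full of small files
        Path output = new Path(args[1]);     // resulting SequenceFile

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
            SequenceFile.Writer.file(output),
            SequenceFile.Writer.keyClass(Text.class),
            SequenceFile.Writer.valueClass(BytesWritable.class),
            // BLOCK compression compresses batches of records together
            SequenceFile.Writer.compression(
                SequenceFile.CompressionType.BLOCK, new DefaultCodec()))) {
          for (FileStatus status : fs.listStatus(inputDir)) {
            if (status.isFile()) {
              // Files are small, so reading each one fully into memory is fine.
              byte[] contents = new byte[(int) status.getLen()];
              try (FSDataInputStream in = fs.open(status.getPath())) {
                in.readFully(contents);
              }
              // key = original file name, value = raw file bytes
              writer.append(new Text(status.getPath().getName()),
                            new BytesWritable(contents));
            }
          }
        }
      }
    }

Because the output is block-compressed and splittable, downstream MapReduce jobs can read it with SequenceFileInputFormat and process it in parallel instead of spawning one map task per tiny file.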
The default data block size of HDFS is 128 MB. When the file size is significantly smaller than the block size, efficiency degrades.
Multiple files are not stored in a single block. A single file can, however, be stored in multiple blocks. The mapping between a file and its block IDs is persisted in the NameNode.
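To see that mapping in practice, a small sketch like the following (the class name and path argument are placeholders) asks the NameNode for a file's block locations through the public FileSystem API; a large file will report several blocks, each replicated on its own set of datanodes:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowBlocks {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path(args[0]);        // e.g. a multi-hundred-MB file
        FileStatus status = fs.getFileStatus(file);
        // One BlockLocation per block of this single file
        BlockLocation[] blocks =
            fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
          System.out.printf("offset=%d length=%d hosts=%s%n",
              block.getOffset(), block.getLength(),
              String.join(",", block.getHosts()));
        }
      }
    }

The hdfs fsck <path> -files -blocks -locations command reports the same file-to-block mapping from the shell.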
According to Hadoop: The Definitive Guide:
Unlike a filesystem for a single disk, a file in HDFS that is smaller than a single block does not occupy a full block’s worth of underlying storage.
HDFS is designed to handle large files. If there are too many small files, the NameNode might get overloaded, since it stores the namespace for HDFS. Check this article on how to alleviate the problem with too many small files.