The Hadoop getmerge description gives the following usage:
Usage: hdfs dfs -getmerge src localdst [addnl]
My question is: why does getmerge concatenate to the local destination rather than to HDFS itself? I am asking because I have the following problems.
Hadoop has four major elements: HDFS, MapReduce, YARN, and Hadoop Common. Most of the other tools or solutions supplement or support these major elements.
How does HDFS store data? HDFS divides files into blocks and stores each block on a DataNode. Multiple DataNodes are linked to the master node in the cluster, the NameNode. The master node distributes replicas of these data blocks across the cluster.
HDFS is made for handling large files by dividing them into blocks, replicating them, and storing them on different cluster nodes. This is what makes it highly fault-tolerant and reliable. HDFS is designed to store large datasets in the range of gigabytes, terabytes, or even petabytes.
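To make the block and replica description concrete, here is a small arithmetic sketch. The 128 MiB block size and replication factor of 3 are assumptions (they are the usual defaults, but a cluster can configure both), and hdfs_footprint is a hypothetical helper name, not part of any Hadoop API.

```python
def hdfs_footprint(file_size_bytes, block_size_bytes=128 * 1024**2, replication=3):
    """Return (blocks, block replicas, raw bytes stored) for one file.

    block_size_bytes and replication are assumed defaults, not values
    read from a real cluster configuration.
    """
    if file_size_bytes == 0:
        return 0, 0, 0
    # Integer ceiling division: a partly filled last block still counts as a block.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    replicas = blocks * replication
    # A partly filled block only uses as much disk as the data it holds,
    # so raw usage is the file size times the replication factor.
    raw_bytes = file_size_bytes * replication
    return blocks, replicas, raw_bytes


# A 1 GiB file under the assumed defaults.
print(hdfs_footprint(1024**3))
```

Under these assumptions a 1 GiB file is split into 8 blocks, stored as 24 block replicas across the DataNodes.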
The getmerge command was created specifically for merging files from HDFS into a single file on the local file system.
This command is very useful for downloading the output of a MapReduce job, which may have generated multiple part-* files, and combining them into a single local file that you can use for other operations (e.g. putting it in an Excel sheet for presentation).
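As a rough illustration of what the merge amounts to, the sketch below concatenates part-* files into one file using only the local file system. It is a model of the behaviour described above, not Hadoop's implementation: merge_parts, the directory layout, and the add_newline flag (standing in for the addnl option) are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def merge_parts(src_dir, local_dst, add_newline=False):
    """Concatenate every regular file in src_dir, in name order, into local_dst.

    local_dst is assumed to live outside src_dir. Returns the number of
    files that were merged.
    """
    parts = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    with open(local_dst, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the file in chunks, one source file at a time.
                shutil.copyfileobj(src, out)
            if add_newline:
                out.write(b"\n")
    return len(parts)


# Simulate a job output directory with two part files, then merge them.
with tempfile.TemporaryDirectory() as tmp:
    job_out = Path(tmp) / "job-output"
    job_out.mkdir()
    (job_out / "part-00000").write_bytes(b"a,1\n")
    (job_out / "part-00001").write_bytes(b"b,2\n")
    merged = Path(tmp) / "merged.csv"
    count = merge_parts(job_out, merged)
    print(count, merged.read_text())
```

Sorting by name keeps the part files in their numbered order, which is usually what you want when the parts are pieces of one logical output.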
Answers to your questions:
If the destination file system does not have enough space, an IOException is thrown. Internally, getmerge uses the IOUtils.copyBytes() function to copy one file at a time from HDFS to the local file. This function throws an IOException whenever there is an error in the copy operation.
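Since the copy only fails once the local disk actually fills up, one option is to compare sizes before starting. The sketch below is an assumption-laden pre-check, not something getmerge does itself: check_room is a hypothetical helper, the source sizes are assumed to be known already (for example from hdfs dfs -du), and the free_bytes parameter exists only so the check can be exercised without a real disk. Python's OSError plays the role that IOException plays in the Java code.

```python
import shutil
import tempfile


def check_room(source_sizes, dst_dir, free_bytes=None):
    """Raise OSError if the merged file would not fit in dst_dir.

    source_sizes: iterable of source file sizes in bytes (assumed known).
    free_bytes:   optional override for the free space, mainly for testing.
    Returns the number of bytes that would remain free after the merge.
    """
    needed = sum(source_sizes)
    if free_bytes is None:
        free_bytes = shutil.disk_usage(dst_dir).free
    if needed > free_bytes:
        raise OSError(
            f"need {needed} bytes but only {free_bytes} are free in {dst_dir}"
        )
    return free_bytes - needed


# Check against the real free space of the local temp directory.
sizes = [4 * 1024, 8 * 1024]  # made-up part file sizes
print(check_room(sizes, tempfile.gettempdir()) >= 0)
```

A check like this is only advisory: other processes can consume disk space between the check and the copy, so the copy itself still has to handle the error.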
This command works along the same lines as the hdfs dfs -get command, which copies a file from HDFS to the local file system. The only difference is that hdfs dfs -getmerge merges multiple files from HDFS into one file on the local file system.
If you want to merge multiple files within HDFS, you can do so with the copyMerge() method of the FileUtil class (see FileUtil.copyMerge()). This API copies all files in a directory into a single file, merging all the source files. Note that copyMerge() is available in Hadoop 2.x and was removed in Hadoop 3.0.