Transfer file out from HDFS

I want to transfer files out of HDFS to the local filesystem of a different server, which is not part of the Hadoop cluster but is on the same network.

I could have done:

hadoop fs -copyToLocal <src> <dest>
and then scp/ftp <toMyFileServer>.

Because the data is huge and the Hadoop gateway machine has limited local disk space, I want to avoid this and send the data directly to my file server.

Please help with some pointers on how to handle this issue.

dipeshtech asked Aug 29 '12 08:08





2 Answers

This is the simplest way to do it:

ssh <YOUR_HADOOP_GATEWAY> "hdfs dfs -cat <src_in_HDFS> " > <local_dst>

It works for binary files too.
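If you are logged in on the gateway rather than on the file server, the same idea can be run in the other direction: pipe the `hdfs dfs -cat` output into `ssh` so nothing is staged on the gateway's disk. This is only a sketch. The function name `stream_hdfs_file` is made up for illustration, and it assumes the file server accepts SSH connections from the gateway.

```shell
# Hypothetical helper (not part of Hadoop): stream one HDFS file to a
# remote host over ssh without writing it to the gateway's local disk.
# Usage: stream_hdfs_file <src_in_HDFS> <user@fileserver> <remote_dst>
stream_hdfs_file() {
    src="$1"
    remote="$2"
    dst="$3"
    # hdfs writes the file's bytes to stdout; the remote `cat` writes
    # them to $dst. The single quotes protect spaces in $dst on the
    # remote side (a path containing a single quote would still break).
    hdfs dfs -cat "$src" | ssh "$remote" "cat > '$dst'"
}
```

For example, `stream_hdfs_file output/part-r-00000 you@somewhere /home/you/part-r-00000` (paths and host are placeholders).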

cabad answered Oct 07 '22 09:10



So you probably have an output directory with a bunch of part files from your Hadoop job:

part-r-00000
part-r-00001
part-r-00002
part-r-00003
part-r-00004

So let's do one part at a time:

for i in $(seq 0 4); do
    # Pull one part to local disk, ship it to the file server, then free the space
    hadoop fs -copyToLocal output/part-r-0000$i ./
    scp ./part-r-0000$i you@somewhere:/home/you/
    rm ./part-r-0000$i
done

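Note that scp will prompt for a password on every iteration. scp has no password option, so set up key-based SSH authentication to run the loop unattended.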
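If the number of parts is not known in advance, the loop can be driven by the directory listing instead of a fixed `seq` range. The following is a rough sketch, not a tested production script: the function name `copy_parts_one_by_one` is invented here, and it assumes the path is the last column of `hdfs dfs -ls` output and that part file names contain no spaces.

```shell
# Hypothetical helper: copy every part-* file in an HDFS directory to a
# remote destination one at a time, so only one part occupies local
# disk at any moment.
# Usage: copy_parts_one_by_one <hdfs_dir> <user@host:/remote/dir/>
copy_parts_one_by_one() {
    hdfs_dir="$1"
    remote_dest="$2"
    # Take the last column of the listing (the path) and keep only the
    # part files; this skips the "Found N items" header and _SUCCESS.
    hdfs dfs -ls "$hdfs_dir" | awk '{print $NF}' | grep '/part-' |
    while read -r part; do
        name=$(basename "$part")
        # </dev/null keeps the commands from consuming the loop's input.
        hdfs dfs -copyToLocal "$part" "./$name" </dev/null &&
        scp "./$name" "$remote_dest" </dev/null &&
        rm -f "./$name" || { echo "failed on $part" >&2; break; }
    done
}
```

For example, `copy_parts_one_by_one output you@somewhere:/home/you/` would mirror the loop above for however many parts exist.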

Dan Ciborowski - MSFT answered Oct 07 '22 09:10
