
Back up an HDFS directory from a fully distributed cluster to a local directory?

I'm trying to back up a directory from HDFS to a local directory. I have a Hadoop/HBase cluster running on EC2. I managed to do what I want in pseudo-distributed mode on my local machine, but now that the cluster is fully distributed the same steps are failing. Here is what worked in pseudo-distributed mode:

hadoop distcp hdfs://localhost:8020/hbase file:///Users/robocode/Desktop/

Here is what I'm trying on the Hadoop namenode (HBase master) on EC2:

ec2-user@ip-10-35-53-16:~$ hadoop distcp hdfs://10.35.53.16:8020/hbase file:///~/hbase

The errors I'm getting are below

13/04/19 09:07:40 INFO tools.DistCp: srcPaths=[hdfs://10.35.53.16:8020/hbase]
13/04/19 09:07:40 INFO tools.DistCp: destPath=file:/~/hbase
13/04/19 09:07:41 INFO tools.DistCp: file:/~/hbase does not exist.
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: Failed to createfile:/~/hbase
    at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1171)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

Philip O'Brien asked Jan 24 '26 20:01


1 Answer

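You can't use the ~ character to represent the home directory here. Tilde expansion is done by the shell, and only when ~ starts a word, so in file:///~/hbase it reaches Java as a literal directory named ~ (note destPath=file:/~/hbase in your log). Change to a fully qualified path, e.g.: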

file:///home/user1/hbase
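
To see the difference without running Hadoop at all, here is a small sketch that only builds the two URI strings in the shell. The variable names literal and expanded are just illustrative, and the path is the one from the question.

```shell
# Unquoted on purpose: "~" is not at the start of the word, so the shell
# leaves it untouched, just as it did in the failing distcp call.
literal=$(echo file:///~/hbase)

# $HOME is expanded by the shell before the command ever sees the string.
expanded="file://$HOME/hbase"

echo "literal:  $literal"
echo "expanded: $expanded"
```

The first line keeps the ~ as-is, while the second shows an absolute path built from $HOME.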

But I think you're going to run into problems in a fully distributed environment: the distcp command runs as a MapReduce job, so the destination path will be interpreted as local to each cluster node.

If you want to pull data down from HDFS to a local directory, you'll need to use the -get or -copyToLocal options of the hadoop fs command.
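
As a minimal sketch of that approach, the helper below wraps hadoop fs -get. The function name backup_hdfs_dir and the destination $HOME/backups/hbase are hypothetical, chosen only for illustration; the source URI is the one from the question. It assumes the hadoop client is on the PATH of the machine that should receive the files.

```shell
# Hypothetical helper: copy an HDFS path to the local filesystem of the
# machine where this function is called.
backup_hdfs_dir() {
    if [ "$#" -ne 2 ]; then
        echo "usage: backup_hdfs_dir <hdfs-src> <local-dest>" >&2
        return 2
    fi
    src=$1     # e.g. hdfs://10.35.53.16:8020/hbase
    dest=$2    # absolute local path; Hadoop will not expand "~"

    # Make sure the parent of the destination exists locally.
    mkdir -p "$(dirname "$dest")" || return 1

    # -get runs inside the client process rather than as a MapReduce job,
    # so the files are written on this machine. -copyToLocal takes the
    # same two arguments.
    hadoop fs -get "$src" "$dest"
}

# Example call (assumed destination):
# backup_hdfs_dir hdfs://10.35.53.16:8020/hbase "$HOME/backups/hbase"
```

Because $HOME is expanded by the shell before hadoop starts, the destination arrives as a plain absolute path and the ~ problem from the question does not come up.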

Chris White answered Jan 27 '26 13:01