I'm trying to back up a directory from HDFS to a local directory. I have a Hadoop/HBase cluster running on EC2. I managed to do what I want running in pseudo-distributed mode on my local machine, but now that I'm fully distributed the same steps are failing. Here is what worked in pseudo-distributed mode:
hadoop distcp hdfs://localhost:8020/hbase file:///Users/robocode/Desktop/
Here is what I'm trying on the Hadoop namenode (HBase master) on EC2:
ec2-user@ip-10-35-53-16:~$ hadoop distcp hdfs://10.35.53.16:8020/hbase file:///~/hbase
The errors I'm getting are below:
13/04/19 09:07:40 INFO tools.DistCp: srcPaths=[hdfs://10.35.53.16:8020/hbase]
13/04/19 09:07:40 INFO tools.DistCp: destPath=file:/~/hbase
13/04/19 09:07:41 INFO tools.DistCp: file:/~/hbase does not exist.
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: Failed to createfile:/~/hbase
at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1171)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
You can't use the ~ character in Java to represent the current home directory, so change to a fully qualified path, e.g.:
file:///home/user1/hbase
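So on your namenode the full command would look something like this (assuming your local home directory is /home/ec2-user; substitute your actual user):
hadoop distcp hdfs://10.35.53.16:8020/hbase file:///home/ec2-user/hbase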
But I think you're going to run into problems in a fully distributed environment anyway, as the distcp command runs a MapReduce job, so the destination path will be interpreted as local to each cluster node rather than to the machine you launched the command from.
If you want to pull data down from HDFS to a local directory on a single machine, you'll need to use the -get or -copyToLocal switches of the hadoop fs command, which run entirely on the client.
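For example, run on the namenode, either of the following should copy the HBase root directory down to a local folder (the local destination path here is just an illustration; both forms are equivalent):
hadoop fs -get /hbase /home/ec2-user/hbase
hadoop fs -copyToLocal /hbase /home/ec2-user/hbase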