Copying file from HDFS to Local Machine

Tags: java, hadoop, hdfs

I'm having a problem downloading a file from HDFS to my local file system, even though the opposite operation (local to HDFS) works without a problem. Note: the file exists on HDFS at the specified path.

Here is a code snippet:

    // Requires: org.apache.hadoop.conf.Configuration,
    //           org.apache.hadoop.fs.FileSystem, org.apache.hadoop.fs.Path
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "${NAMENODE_URI}");
    FileSystem hdfsFileSystem = FileSystem.get(conf);

    String result = "";

    Path local = new Path("${SOME_LOCAL_PATH}");
    Path hdfs = new Path("${SOME_HDFS_PATH}");

    String fileName = hdfs.getName();

    if (hdfsFileSystem.exists(hdfs))
    {
        // Copy from HDFS to the local file system; this is the call that fails
        hdfsFileSystem.copyToLocalFile(hdfs, local);
        result = "File " + fileName + " copied to local machine on location: " + local;
    }
    else
    {
        // Report the HDFS path that was checked, not the local one
        result = "File " + fileName + " does not exist on HDFS on location: " + hdfs;
    }

    return result;

The exception I get is the following:

12/07/13 14:57:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.io.IOException: Cannot run program "cygpath": CreateProcess error=2, The system cannot find the file specified
    at java.lang.ProcessBuilder.start(Unknown Source)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:206)
    at org.apache.hadoop.util.Shell.run(Shell.java:188)
    at org.apache.hadoop.fs.FileUtil$CygPathCommand.<init>(FileUtil.java:412)
    at org.apache.hadoop.fs.FileUtil.makeShellPath(FileUtil.java:438)
    at org.apache.hadoop.fs.FileUtil.makeShellPath(FileUtil.java:465)
    at org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:573)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:565)
    at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:403)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:452)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:420)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:774)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:755)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:654)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:259)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:232)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:183)
    at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1837)
    at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1806)
    at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1782)
    at com.hmeter.hadoop.hdfs.hdfsoperations.HdfsOperations.fileCopyFromHdfsToLocal(HdfsOperations.java:75)
    at com.hmeter.hadoop.hdfs.hdfsoperations.HdfsOperations.main(HdfsOperations.java:148)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
    at java.lang.ProcessImpl.create(Native Method)
    at java.lang.ProcessImpl.<init>(Unknown Source)
    at java.lang.ProcessImpl.start(Unknown Source)
    ... 22 more

Any idea what the issue could be? Why does it require cygpath, which belongs to Cygwin? I'm running this code on Windows 7.

Thanks

Bakir Jusufbegovic asked Jul 13 '12 13:07

1 Answer

You can use the code shown below, which lists an HDFS directory and copies each entry to a local directory:

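import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Class name is illustrative only
public class HdfsToLocal {
    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://localhost:54310/user/hadoop/");
            FileSystem fs = FileSystem.get(conf);
            // List every entry in the HDFS directory
            FileStatus[] status = fs.listStatus(new Path("hdfsdirectory"));
            for (int i = 0; i < status.length; i++) {
                System.out.println(status[i].getPath());
                // false = keep the source file on HDFS after copying
                fs.copyToLocalFile(false, status[i].getPath(), new Path("localdir"));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}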
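
On the "why cygpath" part of the question: the bottom of the trace shows java.lang.ProcessBuilder.start throwing an IOException because no executable named cygpath could be found, and the frames above it show the call coming from RawLocalFileSystem.setPermission on the local side of the copy. A quick way to check whether that executable can be launched on your machine is the sketch below. It uses only the JDK; the class name CygpathCheck and the helper canLaunch are made up for illustration and are not part of Hadoop.

```java
import java.io.IOException;

// Hypothetical diagnostic, not part of Hadoop: checks whether a program
// can be started by the JVM, which is what the failing frame attempts.
public class CygpathCheck {

    // Returns true if the JVM can start the named program, false if
    // ProcessBuilder.start() fails with an IOException (for example
    // because the program is not on the PATH).
    static boolean canLaunch(String program) {
        try {
            Process p = new ProcessBuilder(program)
                    .redirectErrorStream(true)
                    .start();
            // We only care whether it started; stop it right away.
            p.destroy();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("cygpath launchable: " + canLaunch("cygpath"));
    }
}
```

If this prints false on the Windows machine, the JVM cannot find cygpath, which matches the "Cannot run program" message in the trace.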
Animesh Raj Jha answered Oct 12 '22 23:10
