Java Copying File in HDFS to another Directory in HDFS

Tags: java, hadoop, hdfs

I'm using the example in this link here to copy the contents of one directory in HDFS to another directory in HDFS. Copying the file works, but it creates a new subdirectory in the target instead of just copying the file into the target dir. Example:

  Path source=new Path("hdfs://HANameService/sources/hpm_support/apc_code/");
  Path target=new Path("hdfs://HANameService/staging/hpm_support/apc_code/");
  FileSystem fs = source.getFileSystem(conf); 
  FileUtil.copy(fs, source, fs, target, true, conf);

So instead of copying the file to hdfs://HANameService/staging/hpm_support/apc_code, it creates a new dir under apc_code, and the file ends up in hdfs://HANameService/staging/hpm_support/apc_code/apc_code. How can I get it to not create that sub-directory?

jymbo asked May 21 '17 05:05



1 Answer

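When the target directory already exists, FileUtil.copy places the source directory inside it, which is where the nested apc_code/apc_code comes from. To avoid that, list the files in the source directory with an iterator and copy each file into the target individually: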

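            // Needs org.apache.hadoop.fs.{FileSystem, FileUtil, LocatedFileStatus, Path, RemoteIterator};
            // conf is the existing Hadoop Configuration from the question.
            Path source = new Path("hdfs://HANameService/sources/hpm_support/apc_code/");
            Path target = new Path("hdfs://HANameService/staging/hpm_support/apc_code/");
            FileSystem fs = source.getFileSystem(conf);
            // recursive = true also walks subdirectories; every file found is copied directly into target
            RemoteIterator<LocatedFileStatus> sourceFiles = fs.listFiles(source, true);
            while (sourceFiles.hasNext()) {
                // deleteSource = true removes each source file after it is copied (i.e. a move)
                FileUtil.copy(fs, sourceFiles.next().getPath(), fs, target, true, conf);
            }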
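
If you want to try the "copy the children, not the directory" idea without a cluster, here is a minimal sketch of the same pattern using only java.nio.file on a local filesystem. It is an analogy, not Hadoop code: the class name CopyContentsSketch, the method copyContents and the temp-directory names are made up for illustration, and unlike the snippet above it is non-recursive and leaves the source files in place.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyContentsSketch {

    /**
     * Copies every regular file directly inside sourceDir into targetDir.
     * Subdirectories are skipped, so no nested directory is created in targetDir.
     * Returns the number of files copied.
     */
    static int copyContents(Path sourceDir, Path targetDir) throws IOException {
        // Make sure the target exists before copying into it.
        Files.createDirectories(targetDir);
        int copied = 0;
        try (DirectoryStream<Path> children = Files.newDirectoryStream(sourceDir)) {
            for (Path child : children) {
                if (Files.isRegularFile(child)) {
                    // Resolve the destination per file: target/<file name>, not target/<source dir name>/...
                    Path destination = targetDir.resolve(child.getFileName());
                    Files.copy(child, destination, StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical local stand-ins for the HDFS source and staging directories.
        Path source = Files.createTempDirectory("apc_code_src");
        Path target = Files.createTempDirectory("apc_code_dst");
        Files.write(source.resolve("codes.txt"), "example".getBytes());

        int copied = copyContents(source, target);
        System.out.println(copied + " file(s) copied into " + target);
    }
}
```

The point of the sketch is that the destination path is built per file from the target directory plus the file name, so the source directory's own name never appears under the target.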

Hope it is helpful to you

Ramesh Maharjan answered Nov 08 '22 01:11