 

Moving files in Hadoop using the Java API?

Tags: java, hadoop, hdfs

I want to move files around in HDFS using the Java API, but I cannot figure out a way to do this. The FileSystem class only seems to allow moving files to and from the local file system, whereas I want to keep them in HDFS and move them within it.

Am I missing something basic? The only way I can see to do it is to read each file from an input stream, write it back out, and then delete the old copy (yuck).

Thanks

Wanderer asked Mar 31 '11 23:03


3 Answers

Use FileSystem.rename():

public abstract boolean rename(Path src, Path dst) throws IOException

Renames Path src to Path dst. Can take place on local fs or remote DFS.

Parameters:
src - path to be renamed
dst - new path after rename
Returns:
true if rename is successful
Throws:
IOException - on failure
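
As a rough illustration of the call above, here is a minimal sketch of a move within one file system. The class name HdfsMove, the move helper, and the paths in main are hypothetical and not part of the Hadoop API. Creating the destination's parent directory first is an assumption about what is wanted, and the return value is checked because rename() can report failure by returning false rather than throwing.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsMove {

    /**
     * Moves src to dst within the same file system by renaming it.
     * Hypothetical helper for illustration; throws if rename() reports failure.
     */
    public static void move(FileSystem fs, Path src, Path dst) throws IOException {
        // Assumption: we want missing parent directories of dst to be created.
        Path parent = dst.getParent();
        if (parent != null) {
            fs.mkdirs(parent);
        }
        // rename() returns a boolean, so a false result is turned into an exception.
        if (!fs.rename(src, dst)) {
            throw new IOException("Could not move " + src + " to " + dst);
        }
    }

    public static void main(String[] args) throws IOException {
        // Uses whatever default file system the Hadoop configuration on the classpath defines.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Hypothetical paths, for illustration only.
        move(fs, new Path("/data/incoming/part-00000"), new Path("/data/archive/part-00000"));
    }
}
```

Because both paths live on the same FileSystem instance, no data is streamed through the client, which avoids the read, write, and delete sequence described in the question.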

bajafresh4life answered Oct 22 '22 23:10


The java.nio.* approach may not always work on HDFS, so I found the following solution, which does.

Move files from one directory to another using the org.apache.hadoop.fs.FileUtil.copy API:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

val conf = new Configuration()
// Source and destination are the same file system here, but they may differ
val srcFs = FileSystem.get(conf)
val dstFs = FileSystem.get(conf)
val dstPath = new Path(DEST_FILE_DIR)

// fileList is assumed to be a collection of org.apache.hadoop.fs.Path
for (file <- fileList) {
  // The 5th parameter (deleteSource) indicates whether the source should be deleted or not
  FileUtil.copy(srcFs, file, dstFs, dstPath, true, conf)
}
Raj R answered Oct 22 '22 22:10


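I think FileUtil's replaceFile would also serve the purpose: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileUtil.html#replaceFile(java.io.File, java.io.File). Note that, as the signature shows, it takes java.io.File arguments, so it applies to files on the local file system rather than to HDFS paths.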

Kapil D answered Oct 22 '22 22:10