Hadoop - FileSystem.listFiles - not listing directories

Tags: hadoop, hdfs

I am looking at this method: FileSystem.listFiles(Path f, boolean recursive)

List the statuses and block locations of the files in the given path. If the path is a directory, if recursive is false, returns files in the directory; if recursive is true, return files in the subtree rooted at the path. If the path is a file, return the file's status and block locations.

I am testing the method, and it does not seem to return the sub-directories of a given directory. Is this by design? It appears to be, even though its java.io counterpart doesn't work that way. If this limitation is by design, what are the alternatives if I also want to list all sub-directories?

Another method, FileSystem.listStatus(Path f), does not seem to return the statuses of the sub-directories either. What am I missing?

asked Jul 03 '14 13:07 by peter.petrov



2 Answers

Are you getting any kind of error or exception?

You might have used the following code:

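FileStatus[] statuses = fs.listStatus(path);
for (FileStatus status : statuses) {
    // fs.open() only works on files; it fails with an IOException when the entry is a directory.
    try (FSDataInputStream in = fs.open(status.getPath())) {
        // read from the stream here
    }
}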
answered Oct 03 '22 15:10 by abhijitcaps


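Use FileSystem.listLocatedStatus instead of FileSystem.listStatus if you need to list subdirectories as well as files. It returns a RemoteIterator<LocatedFileStatus> over the immediate children of the path, directories included, but it does not descend into them.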
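If you want every sub-directory at any depth, one option is to walk the tree yourself. The sketch below is only an illustration, not the one true way: DirectoryLister and listDirectories are hypothetical names, and main assumes the default Configuration points at your cluster and that the path to scan is passed as the first argument. It relies on FileSystem.listFiles yielding only files (which, as far as I can tell from the Javadoc and source, is by design) and on FileSystem.listStatus reporting the immediate children of a directory, sub-directories included.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DirectoryLister {

    /** Collects every sub-directory under root, at any depth (hypothetical helper). */
    public static List<Path> listDirectories(FileSystem fs, Path root) throws IOException {
        List<Path> dirs = new ArrayList<>();
        // listStatus returns the immediate children of root, files and directories alike.
        for (FileStatus status : fs.listStatus(root)) {
            if (status.isDirectory()) {
                dirs.add(status.getPath());
                // listStatus is not recursive, so descend manually.
                dirs.addAll(listDirectories(fs, status.getPath()));
            }
        }
        return dirs;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the default Configuration resolves to the file system you want to scan.
        FileSystem fs = FileSystem.get(new Configuration());
        for (Path dir : listDirectories(fs, new Path(args[0]))) {
            System.out.println(dir);
        }
    }
}
```

The recursion does not guard against symlink loops or very deep trees, so treat it as a starting point. If you only need the directories one level down, a single listStatus call with the isDirectory() check is enough.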

answered Oct 03 '22 15:10 by eumust