I am looking at this method: FileSystem.listFiles(Path f, boolean recursive)
List the statuses and block locations of the files in the given path. If the path is a directory: when recursive is false, returns the files in the directory; when recursive is true, returns the files in the subtree rooted at the path. If the path is a file, returns the file's status and block locations.
I am testing the method and it seems it's not returning the sub-directories of a given directory. Is this by design? It seems it is, though its java.io counterpart doesn't work that way. If that limitation is by design, what are the alternatives if I want to list all sub-directories too?
Another method, FileSystem.listStatus(Path f), also isn't returning the statuses of the sub-directories. What am I missing?
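For comparison, here is a minimal, self-contained sketch of the java.io behaviour referred to above: File.listFiles() returns both files and sub-directories in a single call (the temporary directory and its contents are created just for the demonstration).

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class JavaIoListDemo {
    public static void main(String[] args) throws IOException {
        // Build a throwaway directory containing one file and one sub-directory.
        File dir = Files.createTempDirectory("listdemo").toFile();
        new File(dir, "a.txt").createNewFile();
        new File(dir, "sub").mkdir();

        // java.io.File.listFiles() returns files AND directories.
        int files = 0, dirs = 0;
        for (File entry : dir.listFiles()) {
            if (entry.isDirectory()) dirs++;
            else files++;
        }
        System.out.println(files + " file(s), " + dirs + " dir(s)");
    }
}
```

Running this prints `1 file(s), 1 dir(s)`, confirming that the directory entry is included.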
The following options are available with the hadoop ls command:

Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>

Options:
-d: Directories are listed as plain files.
-h: Format file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).
-R: Recursively list subdirectories encountered.
If you type hdfs dfs -ls / you will get a list of the directories in HDFS.
The globStatus() methods return an array of FileStatus objects whose paths match the supplied pattern, sorted by path. An optional PathFilter can be specified to restrict the matches further.
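A minimal sketch of how globStatus() with an optional PathFilter might be used. Note this needs a running Hadoop setup; the FileSystem handle, the /user/hduser tree, and the glob pattern here are all assumptions for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public class GlobDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Restrict the glob matches further: keep only entries that are
        // directories. (Each accept() call costs an extra getFileStatus RPC;
        // fine for a sketch, worth batching in production code.)
        PathFilter dirsOnly = path -> {
            try {
                return fs.getFileStatus(path).isDirectory();
            } catch (Exception e) {
                return false;
            }
        };

        // '/user/hduser/*' is a hypothetical pattern: everything one level
        // under the home directory, sorted by path.
        FileStatus[] matches = fs.globStatus(new Path("/user/hduser/*"), dirsOnly);
        for (FileStatus status : matches) {
            System.out.println(status.getPath());
        }
    }
}
```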
By default, a user's home directory in HDFS is '/user/hduser', not '/home/hduser'. If you try to create a directory with a relative path, it will be created as '/user/hduser/sampleDir'.
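The relative-path creation described above presumably looked something like the following sketch (the 'sampleDir' name comes from the answer; the FileSystem handle and a running cluster are assumed):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MkdirDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // A relative Path is resolved against the user's HDFS home directory,
        // so for user 'hduser' this creates /user/hduser/sampleDir.
        fs.mkdirs(new Path("sampleDir"));

        // Printing the resolved path shows the fully qualified location.
        System.out.println(fs.getFileStatus(new Path("sampleDir")).getPath());
    }
}
```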
Are you getting any kind of error or exception?
You might have used the following code:
FileStatus[] status = fs.listStatus(path);
for (int i = 0; i < status.length; i++) {
    // fs.open() throws an IOException when status[i] is a directory,
    // which can make it look as if the sub-directories are missing.
    FSDataInputStream fSDataInputStream = fs.open(status[i].getPath());
}
Use FileSystem.listStatus (or FileSystem.listLocatedStatus) rather than FileSystem.listFiles if you need to list sub-directories as well as files: listFiles deliberately returns only files, while listStatus returns the statuses of both files and directories.
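A recursive walk built on listStatus, which visits both files and directories, can be sketched as follows (the FileSystem handle and starting path are assumed to come from your existing setup, so this needs a running Hadoop environment):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RecursiveList {
    // Print every file and directory in the subtree rooted at 'path'.
    static void walk(FileSystem fs, Path path) throws IOException {
        for (FileStatus status : fs.listStatus(path)) {
            System.out.println(status.getPath());
            if (status.isDirectory()) {
                walk(fs, status.getPath()); // recurse into sub-directories
            }
        }
    }

    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        walk(fs, new Path(args[0]));
    }
}
```

Unlike listFiles(path, true), this keeps the directory entries in the output instead of filtering them out.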