Is there an hdfs command to list files in an HDFS directory by timestamp?

Tags: hadoop, hdfs

Is there an hdfs command to list the files in an HDFS directory sorted by timestamp, ascending or descending? By default, the hdfs dfs -ls command gives an unsorted list of files.

When I searched for answers, all I found was a workaround: hdfs dfs -ls /tmp | sort -k6,7. Is there a better way, built into the hdfs dfs command line?

PradeepKumbhar asked May 04 '16 08:05


1 Answer

For Hadoop versions earlier than 2.7, no: there is no built-in option to sort the files by date and time.
You will have to pipe the listing through sort -k6,7 as you are doing (fields 6 and 7 of the listing are the modification date and time):

hdfs dfs -ls /tmp | sort -k6,7 
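The workaround can be wrapped in a small helper so that both orders are available. This is only a sketch: the function name sort_by_mtime is made up for illustration, and it assumes the usual eight-column listing (permissions, replication, owner, group, size, date, time, path) plus the "Found N items" header line that hdfs dfs -ls prints for a directory.

```shell
# Sort `hdfs dfs -ls` output (read from stdin) by modification date and time.
# Pass "desc" as the first argument for newest-first order.
# sort_by_mtime is a hypothetical helper name, not part of Hadoop.
sort_by_mtime() {
  if [ "$1" = "desc" ]; then
    # Drop the "Found N items" header, then sort on fields 6-7 in reverse.
    grep -v '^Found ' | LC_ALL=C sort -r -k6,7
  else
    grep -v '^Found ' | LC_ALL=C sort -k6,7
  fi
}

# Usage (needs a Hadoop client on the PATH):
#   hdfs dfs -ls /tmp | sort_by_mtime        # oldest first
#   hdfs dfs -ls /tmp | sort_by_mtime desc   # newest first
```

The sort works as a plain string comparison because the listing prints dates as YYYY-MM-DD HH:MM, which orders the same way lexically and chronologically.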

For Hadoop 2.7.x, the ls command has the following options available:

Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>

Options:
-d: Directories are listed as plain files.
-h: Format file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).
-R: Recursively list subdirectories encountered.
-t: Sort output by modification time (most recent first).
-S: Sort output by file size.
-r: Reverse the sort order.
-u: Use access time rather than modification time for display and sorting.

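So you can sort the files directly. Newest first, recursing into subdirectories:

hdfs dfs -ls -t -R /tmp

Add -r to reverse the order (oldest first):

hdfs dfs -ls -t -R -r /tmp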
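If a script has to run against clusters of different ages, the two approaches can be combined. The following is a rough sketch, not a tested recipe: the function names are hypothetical, the 2.7 threshold is taken from this answer as-is, and it assumes that hadoop version prints "Hadoop X.Y.Z" on its first line. Running hadoop fs -help ls on your own client is the reliable way to confirm which options it really accepts.

```shell
# Return success (0) if the given Hadoop version string is 2.7 or newer.
# The 2.7 threshold is the one stated in this answer (an assumption here).
supports_ls_sort() {
  major=${1%%.*}     # text before the first dot, e.g. "2"
  rest=${1#*.}       # text after the first dot, e.g. "7.3"
  minor=${rest%%.*}  # e.g. "7"
  [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 7 ]; }
}

# List a directory oldest-first, using the built-in sort when available.
# Assumes the first line of `hadoop version` looks like "Hadoop 2.7.3".
ls_oldest_first() {
  version=$(hadoop version | awk 'NR==1 {print $2}')
  if supports_ls_sort "$version"; then
    hdfs dfs -ls -t -r "$1"
  else
    hdfs dfs -ls "$1" | sort -k6,7
  fi
}

# Usage (needs a Hadoop client on the PATH):
#   ls_oldest_first /tmp
```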

Nishu Tayal answered Sep 22 '22 19:09