I want the latest updated file from one of my HDFS directories. The code should loop through the directory and its subdirectories and get the latest file path, including the file name. I was able to do this on the local file system, but I am not sure how to do it for HDFS.
find /tmp/sdsa -type f -print0 | xargs -0 stat --format '%Y :%y %n' | sort -nr | cut -d: -f2- | head
The above command works for the local file system. I am able to get the date, time, and file name from HDFS, but how do I get the latest file using these three fields?
This is the command I tried:
hadoop fs -ls -R /tmp/apps | awk -F" " '{print $6" "$7" "$8}'
Any help will be appreciated.
Thanks in advance.
This one worked for me:
hadoop fs -ls -R /tmp/app | awk -F" " '{print $6" "$7" "$8}' | sort -nr | head -1 | cut -d" " -f3
The output is the entire file path.
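Two caveats apply to this pipeline, and to any variant of it: hadoop fs -ls -R lists directories as well as files, so a recently modified directory can win over the actual latest file, and the field-based extraction breaks on paths that contain spaces. Because the modification date and time are printed in ISO order (yyyy-MM-dd HH:mm), they sort correctly as plain strings, so as a sketch (assuming the same /tmp/app path) you could sort the raw listing on those columns directly and filter out directory entries, whose permission string starts with d:

hadoop fs -ls -R /tmp/app | grep -v '^d' | sort -k6,6 -k7,7 | tail -1 | awk '{print $8}'

Here $6 is the modification date, $7 the time, and $8 the full path; tail -1 takes the most recent entry after the ascending sort.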