How to list only the file names in HDFS

Tags:

shell

hadoop

I would like to know whether there is any command/expression to get only the file name in Hadoop. I need to fetch only the name of the file, but when I run hadoop fs -ls it prints the whole path.

I tried the command below, but I am wondering whether there is a better way to do it.

hadoop fs -ls <HDFS_DIR>|cut -d ' ' -f17

asked Feb 05 '14 05:02

Navneet Kumar

2 Answers

The following command will return filenames only:

hdfs dfs -stat "%n" my/path/*

Added Feb 04 '21:

Actually, for the last few years I have been using

hdfs dfs -ls -d my/path/* | awk '{print $8}'

and

hdfs dfs -ls my/path | grep -e "^-" | awk '{print $8}'
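
As a rough sketch, the filter and the column extraction in the second pipeline can be folded into a single awk call that also strips the directory part, so only the base name is printed. The function name hdfs_file_names is hypothetical, and the sketch assumes the usual eight-column listing and paths that contain no spaces.

```shell
# Hypothetical helper (not part of Hadoop): reads `hdfs dfs -ls` output on stdin
# and prints the base name of every regular file.
# Assumes the usual 8-column listing and paths without spaces.
hdfs_file_names() {
  awk '/^-/ {                    # a permission string starting with "-" marks a regular file
    n = split($8, parts, "/")    # column 8 holds the path; split it on "/"
    print parts[n]               # the last component is the file name
  }'
}
```

It would be used as: hdfs dfs -ls my/path | hdfs_file_names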

answered Sep 27 '22 20:09

MichealKum

It seems that hadoop fs -ls does not support any option to output just the file names, or even just the last column.

If you want to get the last column reliably, you should first collapse each run of whitespace into a single space, so that you can then address the last column by a fixed field number:

hadoop fs -ls | sed '1d;s/  */ /g' | cut -d\  -f8

This gives you just the last column, but each file is still shown with its whole path. If you want just the file names, you can use basename as @rojomoke suggests:

hadoop fs -ls | sed '1d;s/  */ /g' | cut -d\  -f8 | xargs -n 1 basename

I also filtered out the first line, which says Found ?x items.
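
To try these steps without a cluster, the same pipeline can be wrapped in a small function that reads a listing from stdin. This is only a sketch: the name ls_base_names is hypothetical, and it assumes the eight-column layout and paths without spaces.

```shell
# Hypothetical helper: same steps as the pipeline above, reading the listing from stdin.
ls_base_names() {
  sed '1d; s/  */ /g' |    # drop the header line, squeeze runs of spaces into one
    cut -d ' ' -f8 |       # after squeezing, field 8 is always the path column
    xargs -n 1 basename    # strip the directory part, one path at a time
}
```

Without the squeezing step, cut -d ' ' counts every single space as a separator, so the field number of the path depends on how much padding the listing happens to contain.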

Note: as @felix-frank points out in the comments, the above command will not correctly preserve file names that contain multiple consecutive spaces. Hence a more correct solution, proposed by Felix:

hadoop fs -ls /tmp | sed 1d | perl -wlne'print +(split " ",$_,8)[7]'
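
Where Perl is not available, a similar effect can probably be had with sed alone: remove the first seven columns together with their padding, keep the rest of the line untouched, and then drop everything up to the last slash. This is a sketch rather than a drop-in replacement. The function name ls_names_keep_spaces is hypothetical, it assumes the eight-column layout and a sed that accepts -E (GNU and BSD sed do), and unlike the Perl one-liner it also strips the directory part.

```shell
# Hypothetical helper: reads `hadoop fs -ls` output on stdin, prints one base name per line.
#   1st expression: drop the "Found N items" header if it is present
#   2nd expression: remove the first seven columns and the spaces after each
#   3rd expression: remove everything up to the last "/" (the directory part)
ls_names_keep_spaces() {
  sed -E -e '/^Found [0-9]+ items$/d' \
         -e 's/^([^ ]+ +){7}//' \
         -e 's|.*/||'
}
```

It would be used as: hadoop fs -ls /tmp | ls_names_keep_spaces. Because the remainder of the line is never split or squeezed, consecutive spaces inside a file name survive.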

answered Sep 27 '22 19:09

Jakub Kotowski
