What is the best practice for printing a top 10 list of largest files in a POSIX shell? There has to be something more elegant than my current solution:
DIR="."
N=10
LIMIT=512000
find $DIR -type f -size +"${LIMIT}k" -exec du {} \; | sort -nr | head -$N | perl -p -e 's/^\d+\s+//' | xargs -I {} du -h {}
where LIMIT is a file size threshold to limit the results of find.
Listing Files In Size Order Using the ls Command in Linux
To list the directory contents in descending file size order, use the ls command along with the -lS argument. The largest files appear at the top of the list and the smallest at the bottom.
du command -h option: Display sizes in human-readable format (e.g., 1K, 234M, 2G).
du command -s option: Show only a total for each argument (summary).
du command -x option: Skip directories on different file systems.
sort command -r option: Reverse the result of comparisons.
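The options listed above can be combined into one pipeline. The following is a minimal sketch, not a definitive recipe: the function name largest_entries and its two arguments are hypothetical, and sort -h (which compares human-readable sizes) is assumed to be available, as it is in GNU coreutils.

```shell
# Hypothetical helper: print the N largest entries directly under a directory.
# Usage: largest_entries [dir] [count]
largest_entries() {
    dir=${1:-.}
    n=${2:-10}
    # -s: one total per argument, -h: human-readable sizes,
    # -x: do not cross into other file systems.
    # sort -h understands suffixes such as K, M, G; -r puts the largest first.
    du -shx -- "$dir"/* | sort -rh | head -n "$n"
}
```

For example, largest_entries /var 5 would print the five largest entries under /var. The glob does not match hidden entries (names starting with a dot).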
To list files sorted by size, use the -S option of ls. By default, the output is in descending order (largest first). Adding the -h option together with -l prints the sizes in human-readable format, and adding the -r flag reverses the order so that the smallest files come first.
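As a small illustration of these ls options, here is a sketch that wraps them in two functions. The function names are hypothetical, and -h is assumed to be supported by the local ls (it is in the GNU and BSD versions).

```shell
# Hypothetical helpers around ls; the directory defaults to the current one.

# Largest first, long format, human-readable sizes.
list_by_size() {
    ls -lSh -- "${1:-.}"
}

# Same listing with -r added, so the smallest entries come first.
list_by_size_reversed() {
    ls -lShr -- "${1:-.}"
}
```

Note that ls only sorts the entries of a single directory; it does not descend into subdirectories the way find or du does.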
Edit:
Using GNU utilities (du and sort):
du -0h | sort -zrh | tr '\0' '\n'
This uses a null delimiter to pass information between du and sort, and uses tr to convert the nulls to newlines. The nulls allow this pipeline to process filenames which may include newlines. The -h option makes du print sizes in human-readable form, and the -h option of sort makes it compare those human-readable sizes correctly.
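To connect this back to the original question (a top 10 of the largest files), the same null-delimited pipeline can be fed from find and cut off with head. This is only a sketch under some assumptions: the function name top_files and its arguments are hypothetical, and it relies on GNU du, sort and head for the -0 and -z options.

```shell
# Hypothetical helper: print the N largest regular files below a directory.
# Usage: top_files [dir] [count]
top_files() {
    dir=${1:-.}
    n=${2:-10}
    # find passes regular files to du in batches; du -0 ends each record
    # with a null byte so filenames containing newlines stay intact.
    find "$dir" -type f -exec du -0h {} + |
        sort -zrh |          # null-delimited, human-readable, largest first
        head -z -n "$n" |    # keep the first N null-terminated records
        tr '\0' '\n'         # convert to newlines for display
}
```

A size threshold such as the -size test from the question could be added to the find invocation to reduce the amount of data that has to be sorted.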
Original:
This uses awk to create extra columns for sort keys. It only calls du once. The output should look exactly like du.
I've split it into multiple lines, but it can be recombined into a one-liner.
du -h |
awk '{printf "%s %08.2f\t%s\n",
index("KMG", substr($1, length($1))),
substr($1, 1, length($1)-1), $0}' |
sort -r | cut -f2-
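Explanation:
The awk step prepends two sort keys to each line: the position of the unit suffix in "KMG" and the numeric part of the size, zero-padded to a fixed width. sort -r then orders the lines by those keys, and cut removes them again. Try it without the cut command to see what it's doing.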
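To make the intermediate keys visible without running du, the awk stage can be fed a few sample lines. This is a sketch only: the function name add_sort_keys and the sample sizes and paths are made up for illustration, and the start index passed to substr is 1 because awk strings are 1-based.

```shell
# Hypothetical wrapper around the awk stage: prepend the two sort keys
# (unit rank within "KMG" and zero-padded number) to each du-style line.
add_sort_keys() {
    awk '{printf "%s %08.2f\t%s\n",
          index("KMG", substr($1, length($1))),
          substr($1, 1, length($1)-1), $0}'
}

# Made-up sample input in the format of du -h (size, tab, path).
printf '4.0K\t./a\n1.5M\t./b\n2.0G\t./c\n' | add_sort_keys
# Prints (tab-separated after the keys):
# 1 00004.00	4.0K	./a
# 2 00001.50	1.5M	./b
# 3 00002.00	2.0G	./c
```

Because the unit rank comes first, a plain reverse sort places every G line above every M line, and the padded number breaks ties within the same unit.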
Edit:
Here's a version which does the sorting within the AWK script and doesn't need cut (requires GNU AWK (gawk) for asorti support):
du -h0 |
gawk 'BEGIN {RS = "\0"}
{idx = sprintf("%s %08.2f %s",
index("KMG", substr($1, length($1))),
substr($1, 1, length($1)-1), $0);
lines[idx] = $0}
END {c = asorti(lines, sorted);
for (i = c; i >= 1; i--)
print lines[sorted[i]]}'
Edit: Added null record separation in order to handle potential filenames which include newlines. Requires GNU du and gawk.