I've written a simple shell script that finds large files, mostly to save myself some typing. The work is being done with:
find $dir -type f -size +"$size"M -printf '%s %p\n' | sort -rn
I'd like to turn the byte output into a human-readable format. I found ways online to do this manually, e.g.,
find $dir -type f -size +"$size"M -printf '%s %p\n' | sort -rn |
awk '{ hum[1024**4]="TB"; hum[1024**3]="GB"; hum[1024**2]="MB"; hum[1024]="KB"; hum[0]="B";
for (x=1024**4; x>=1024; x/=1024){
if ($1>=x) { printf "%7.2f %s\t%s\n",$1/x,hum[x],$2;break }
}}'
But this seems messy. I was wondering: is there a standard way to convert bytes into a human-readable form?
Of course, any alternate methods of producing the below output, given a directory and min-size as input, are also welcome:
1.25 GB /foo/barf
598.80 MB /foo/bar/bazf
500.58 MB /bar/bazf
421.70 MB /bar/baz/bamf
...
Note: This must work on both 2.4 and 2.6, and the output should be sorted.
Use du -h and sort -h:
find /your/dir -type f -size +5M -exec du -h '{}' + | sort -hr
Explanations:
du -h file1 file2 ... prints the disk usage of the given files in human-readable format.
sort -hr sorts human-readable numbers in reverse order (largest first).
The + terminator of find -exec reduces the number of invocations of du and therefore speeds up execution. Here + can be replaced by ';'.
You can remove the -r option of sort if you want the largest files printed last.
You can even use the simpler command below, but it lists every file, so the output may overflow your terminal's scrollback buffer:
find /your/dir -type f -exec du -h '{}' + | sort -h
Or, if you want just the ten largest files:
find /your/dir -type f -exec du -h '{}' + | sort -hr | head
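Since the question asks for a directory and a minimum size as inputs, the pipeline above could be wrapped in a small function. This is only a sketch: the name largest_files and its argument order are made up for illustration, and it assumes GNU find (for the M size suffix) and GNU sort (for -h).

```shell
# largest_files DIR MIN_MB [COUNT]
# List regular files under DIR larger than MIN_MB mebibytes,
# largest first, printing at most COUNT lines (default 10).
# Hypothetical helper; assumes GNU find and GNU sort.
largest_files() {
    find "$1" -type f -size +"$2"M -exec du -h '{}' + |
        sort -hr |
        head -n "${3:-10}"
}
```

For example, largest_files /your/dir 5 would print the ten largest files over 5 MiB, and largest_files /your/dir 5 3 only the top three.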
Note: the -h option of sort was introduced around 2009, so it may not be available on older distributions (such as Red Hat 5). The + terminator of find -exec is likewise unavailable on older distributions (such as Red Hat 4).
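Where sort -h is missing, one possible fallback is to sort on the raw byte counts first and convert to human-readable units only afterwards. The sketch below is a tidier variant of the awk approach from the question; the function name humanize is hypothetical, and it assumes each input line has the form "BYTES PATH", as produced by the -printf '%s %p\n' format of GNU find.

```shell
# humanize: read "BYTES PATH" lines on stdin, print "SIZE UNIT PATH".
# Hypothetical helper; uses only POSIX awk features (no ** operator).
humanize() {
    awk '{
        bytes = $1
        # Drop the size field and the single space after it, so that
        # paths containing spaces are kept intact.
        path = substr($0, length($1) + 2)
        n = split("B KB MB GB TB", unit, " ")
        i = 1
        # Divide by 1024 until the value fits the current unit.
        while (bytes >= 1024 && i < n) { bytes /= 1024; i++ }
        printf "%.2f %s %s\n", bytes, unit[i], path
    }'
}

# Possible usage (numeric sort happens before the conversion):
#   find "$dir" -type f -size +"$size"M -printf '%s %p\n' | sort -rn | humanize
```

Because the sorting is done on plain integers, this variant needs neither sort -h nor du -h.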
On older distributions, you can use xargs instead of the + terminator of find -exec. The ls command can also print files sorted by size, but to guarantee the ordering, xargs must invoke ls only once. xargs does so only if the file list is small enough: the limit depends on the total length of the arguments passed to ls (the sum of all filename lengths).
find /your/dir -type f -size +5M -print0 | xargs -0 ls -1Ssh
(with a little inspiration borrowed from MichaelKrelin-hacker).
Explanations:
ls -1 displays one file per line.
ls -S sorts by file size.
ls -s prints the file size.
ls -h prints sizes in human-readable format.
The fastest command may be the above ls -1Ssh combined with the + terminator of find -exec. As above, the number of files must be small enough for ls to be invoked only once, which is what guarantees the sorting by size (the + terminator of find -exec works in much the same way as xargs):
find /your/dir -type f -size +5M -exec ls -1Ssh '{}' +
To reduce the number of files found, you can raise the size threshold: replace +5M with +100M, for instance.
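To check whether a given file list is small enough for a single invocation, one option is to count how many batches xargs produces. The sketch below uses a hypothetical helper named count_batches; it substitutes a trivial sh -c command for ls, so it only reports the batch count and lists nothing.

```shell
# count_batches DIR
# Print how many times xargs would invoke a command for the regular
# files under DIR. A result of 1 means ls -S would see every file
# at once, so its size ordering would be global.
# Hypothetical helper; relies on -print0 / -0 as used above.
count_batches() {
    find "$1" -type f -print0 |
        xargs -0 sh -c 'echo batch' sh |
        wc -l
}
```

If count_batches /your/dir prints anything greater than 1, the ls -1Ssh output would be sorted only within each batch, and the du -h with sort -hr pipeline is the safer choice.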
find ... | sort -rn | cut -d' ' -f2- | xargs du -h
for instance :) or
find "$dir" -type f -size +"$size"M -print0 | xargs -0 ls -1hsS
(with a little inspiration borrowed from olibre).