I have a file structure like this:
a/file1
a/file2
a/file3
a/...
b/file1
b/file2
b/file3
b/...
...
where, within each directory, some files have the same file size, and I would like to delete those duplicates.
I guess if the problem could be solved for one directory, e.g. a, then I could wrap a for-loop around it?
for f in *; do
???
done
But how do I find files with the same size?
ls -l|grep '^-'|awk '{if(a[$5]){ a[$5]=a[$5]"\n"$NF; b[$5]++;} else a[$5]=$NF} END{for(x in b)print a[x];}'
This only checks regular files, not directories: grep '^-' keeps only the ls -l lines that describe regular files.
$5 is the size column of the ls -l output.
test:
kent@ArchT60:/tmp/t$ ls -l
total 16
-rw-r--r-- 1 kent kent 51 Sep 24 22:23 a
-rw-r--r-- 1 kent kent 153 Sep 24 22:24 all
-rw-r--r-- 1 kent kent 51 Sep 24 22:23 b
-rw-r--r-- 1 kent kent 51 Sep 24 22:23 c
kent@ArchT60:/tmp/t$ ls -l|grep '^-'|awk '{if(a[$5]){ a[$5]=a[$5]"\n"$NF; b[$5]++;} else a[$5]=$NF} END{for(x in b)print a[x];}'
a
b
c
kent@ArchT60:/tmp/t$
Update based on Michał Šrajer's comment:
File names with spaces are now supported as well.
Command:
ls -l|grep '^-'|awk '{ f=""; if(NF>9)for(i=9;i<=NF;i++)f=f?f" "$i:$i; else f=$9;
if(a[$5]){ a[$5]=a[$5]"\n"f; b[$5]++;} else a[$5]=f}END{for(x in b)print a[x];}'
test:
kent@ArchT60:/tmp/t$ l
total 24
-rw-r--r-- 1 kent kent 51 Sep 24 22:23 a
-rw-r--r-- 1 kent kent 153 Sep 24 22:24 all
-rw-r--r-- 1 kent kent 51 Sep 24 22:23 b
-rw-r--r-- 1 kent kent 51 Sep 24 22:23 c
-rw-r--r-- 1 kent kent 51 Sep 24 22:40 x y
kent@ArchT60:/tmp/t$ ls -l|grep '^-'|awk '{ f=""
if(NF>9)for(i=9;i<=NF;i++)f=f?f" "$i:$i; else f=$9;
if(a[$5]){ a[$5]=a[$5]"\n"f; b[$5]++;} else a[$5]=f} END{for(x in b)print a[x];}'
a
b
c
x y
kent@ArchT60:/tmp/t$
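The same grouping can also be done without parsing ls output, which avoids counting columns to reassemble names. The following is only a minimal sketch: same_size_files is a hypothetical helper name, and it assumes GNU stat (for -c %s) and file names that contain no newlines.

```shell
# same_size_files [DIR]: list every regular file in DIR (default: current
# directory) whose size matches at least one other regular file in DIR.
# Hypothetical helper; assumes GNU stat and no newlines in file names.
same_size_files() {
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # regular files only, like grep '^-'
        # Emit "size/basename"; a basename never contains "/", so awk can
        # split on it safely even when the name has spaces.
        printf '%s/%s\n' "$(stat -c %s "$f")" "${f##*/}"
    done |
    awk -F/ '
        { count[$1]++; names[$1] = names[$1] $2 "\n" }
        END { for (s in count) if (count[s] > 1) printf "%s", names[s] }
    '
}
```

Calling same_size_files /tmp/t on the test directory above should list a, b, c and x y, one per line; the order of the size groups is not guaranteed, because awk does not define the iteration order of for (s in count).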
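Solution that works with file names containing spaces (based on Kent's (+1) and awiebe's (+1) posts; it assumes GNU stat and GNU xargs):
for FILE in *; do stat -c"%s/%n" "$FILE"; done | awk -F/ '{if ($1 in a)print $2; else a[$1]=1}' | xargs -d '\n' echo rm
The -d '\n' option makes xargs split its input on newlines only; by default it also splits on spaces, so a name such as "x y" would reach rm as two separate arguments.
This keeps the first file of each size and prints an rm command for the rest. To actually remove the duplicates, drop echo from the xargs command.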
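To apply this per directory, as the question asks, the pipeline can be wrapped in a loop over the directories. This is a sketch under assumptions rather than a finished tool: print_size_dupes and dry_run_dirs are hypothetical names, GNU stat is assumed, file names must not contain newlines, and it only prints what it would delete.

```shell
# print_size_dupes DIR: print each regular file in DIR whose size was already
# seen on an earlier file in DIR (the first file of each size is kept).
# Hypothetical helper; assumes GNU stat.
print_size_dupes() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue      # skip directories and other non-files
        printf '%s/%s\n' "$(stat -c %s "$f")" "${f##*/}"
    done |
    awk -F/ '$1 in seen { print $2; next } { seen[$1] = 1 }'
}

# dry_run_dirs DIR...: for every directory given, show the rm command that
# would delete each same-size duplicate, without running it.
dry_run_dirs() {
    for d in "$@"; do
        print_size_dupes "$d" | while IFS= read -r name; do
            printf 'rm -- %s\n' "$d/$name"
        done
    done
}
```

For the layout in the question this would be called as dry_run_dirs a b. The printed lines are meant for inspection only (the names are not shell-quoted); to really delete, the printf line could be replaced with rm -- "$d/$name".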