
How to find files with same size?

Tags: linux, bash, awk

I have a file structure like this:

a/file1
a/file2
a/file3
a/...
b/file1
b/file2
b/file3
b/...
...

Within each directory, some files have the same file size, and I would like to delete those.

I guess that if the problem could be solved for one directory, e.g. a, I could then wrap a for loop around it:

for f in *; do
???
done

But how do I find files with same size?

Sandra Schlichting asked Sep 24 '11 20:09


2 Answers

 ls -l|grep '^-'|awk '{if(a[$5]){ a[$5]=a[$5]"\n"$NF; b[$5]++;} else a[$5]=$NF} END{for(x in b)print a[x];}'

This only considers regular files, not directories.

$5 is the size column in the output of ls -l.

test:

kent@ArchT60:/tmp/t$ ls -l
total 16
-rw-r--r-- 1 kent kent  51 Sep 24 22:23 a
-rw-r--r-- 1 kent kent 153 Sep 24 22:24 all
-rw-r--r-- 1 kent kent  51 Sep 24 22:23 b
-rw-r--r-- 1 kent kent  51 Sep 24 22:23 c
kent@ArchT60:/tmp/t$ ls -l|grep '^-'|awk '{if(a[$5]){ a[$5]=a[$5]"\n"$NF; b[$5]++;} else a[$5]=$NF} END{for(x in b)print a[x];}'
a
b
c
kent@ArchT60:/tmp/t$ 

Update, based on Michał Šrajer's comment:

Filenames containing spaces are now supported as well.

command:

 ls -l|grep '^-'|awk '{ f=""; if(NF>9)for(i=9;i<=NF;i++)f=f?f" "$i:$i; else f=$9; 
        if(a[$5]){ a[$5]=a[$5]"\n"f; b[$5]++;} else a[$5]=f}END{for(x in b)print a[x];}'

test:

kent@ArchT60:/tmp/t$ l
total 24
-rw-r--r-- 1 kent kent  51 Sep 24 22:23 a
-rw-r--r-- 1 kent kent 153 Sep 24 22:24 all
-rw-r--r-- 1 kent kent  51 Sep 24 22:23 b
-rw-r--r-- 1 kent kent  51 Sep 24 22:23 c
-rw-r--r-- 1 kent kent  51 Sep 24 22:40 x y

kent@ArchT60:/tmp/t$ ls -l|grep '^-'|awk '{ f=""
        if(NF>9)for(i=9;i<=NF;i++)f=f?f" "$i:$i; else f=$9; 
        if(a[$5]){ a[$5]=a[$5]"\n"f; b[$5]++;} else a[$5]=f} END{for(x in b)print a[x];}'
a
b
c
x y

kent@ArchT60:/tmp/t$
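
Parsing the columns of ls -l depends on the exact output layout. As a hedged alternative, the same grouping idea can be sketched with stat instead. This assumes GNU stat (for -c %s) and file names without "/" or newlines; the function name same_size_files is made up for illustration and is not part of the original answer.

```shell
# Hypothetical helper: list every regular file in a directory whose size
# is shared with at least one other file in that directory.
same_size_files() {
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip directories and other non-files
        # Emit "size/name"; "/" is a safe separator because it cannot occur in a name.
        printf '%s/%s\n' "$(stat -c %s "$f")" "${f##*/}"
    done |
    awk -F/ '
        {
            size = $1; name = $2
            if (size in names) {             # size seen before: extend the group
                names[size] = names[size] "\n" name
                dup[size] = 1
            } else {
                names[size] = name           # first file of this size
            }
        }
        END { for (s in dup) print names[s] } # only groups with two or more files
    '
}
```

Calling same_size_files /tmp/t on the test directory above should list a, b, c and "x y". The order in which different size groups are printed is not defined by awk.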
Kent answered Sep 22 '22 03:09


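A solution that works with file names containing spaces (based on the posts by Kent (+1) and awiebe (+1)):

for FILE in *; do stat -c"%s/%n" "$FILE"; done | awk -F/ '{if ($1 in a)print $2; else a[$1]=1}' | xargs -d '\n' echo rm

This keeps the first file of each size and prints an rm command for the others. The -d '\n' option (GNU xargs) makes xargs split its input on newlines only, so names with spaces stay intact. To actually remove the duplicates, drop echo from the xargs call.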
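
To cover the a/, b/, ... layout from the question, the per-directory check can be wrapped in an outer loop. The following is only a sketch under a few assumptions: GNU stat is available, no path contains a newline, and the function name list_size_duplicates is invented here. It only prints candidates and deletes nothing by itself.

```shell
# Hypothetical helper: for each subdirectory of the given root, print the files
# whose size matches an earlier file in the same directory (the first is kept).
list_size_duplicates() {
    root=${1:-.}
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        for f in "$dir"*; do
            [ -f "$f" ] || continue          # regular files only
            printf '%s %s\n' "$(stat -c %s "$f")" "$f"
        done |
        awk '{
            size = $1
            sub(/^[0-9]+ /, "")              # the rest of the line is the path
            if (size in seen) print          # size already seen in this directory
            else seen[size] = 1
        }'
    done
}

# Review the list first; only then feed it to rm, for example:
# list_size_duplicates . | xargs -d '\n' rm --
```

Because awk is started once per directory, sizes are compared only within a directory, never across a/ and b/. Equal size does not guarantee equal content, so checking the printed list before deleting is worthwhile.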

Michał Šrajer answered Sep 21 '22 03:09