
How to find duplicate files with the same name but in different case that exist in the same directory in Linux?

How can I list files that are duplicates by name, i.e. files in the same directory whose names differ only in case?

I don't care about the contents of the files. I just need to know the location and name of any files that have a duplicate of the same name.

Example duplicates:

/www/images/taxi.jpg
/www/images/Taxi.jpg

Ideally I need to search all files recursively from a base directory. In the example above, that would be /www/.

Camsoft asked Jan 21 '10 12:01


1 Answer

The other answer is great, but instead of the "rather monstrous" Perl script I suggest:

perl -pe 's!([^/]+)$!lc $1!e' 

This lowercases just the filename part of each path and leaves the directory part untouched.

Edit 1: In fact the entire problem can be solved with:

find . | perl -ne 's!([^/]+)$!lc $1!e; print if 1 == $seen{$_}++' 

Edit 3: I found a solution using sed, sort and uniq that also prints out the duplicates, but it only works if there is no whitespace in the filenames:

find . | sed 's,\(.*\)/\(.*\)$,\1/\2\t\1/\L\2,' | sort -k 2 | uniq -D -f 1 | cut -f 1
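The pipeline is easier to follow one stage at a time. The sketch below assumes GNU sed (for \t and \L in the replacement) and GNU uniq (for -D), and uses made-up paths and function names. It also sorts on the lowercased column with `sort -k 2` under LC_ALL=C, which is my adjustment so that matching names end up adjacent whatever the locale.

```shell
# Stage 1: append a tab and a copy of the path with a lowercased filename.
tag_with_lowercase() {
    sed 's,\(.*\)/\(.*\)$,\1/\2\t\1/\L\2,'
}

# Full pipeline: sort on the lowercased column, keep every line whose
# second field repeats, then cut the original path back out.
case_dups_sed() {
    tag_with_lowercase | LC_ALL=C sort -k 2 | uniq -D -f 1 | cut -f 1
}

# Made-up paths standing in for the output of `find .`
printf '%s\n' ./images/Taxi.jpg | tag_with_lowercase
# prints: ./images/Taxi.jpg<TAB>./images/taxi.jpg

printf '%s\n' ./images/taxi.jpg ./images/Taxi.jpg ./images/bus.jpg | case_dups_sed
# prints:
# ./images/Taxi.jpg
# ./images/taxi.jpg
```

The whitespace restriction comes from `uniq -f 1`, which splits fields on blanks, so a space inside a filename would shift the field it compares.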

Edit 2: And here is a longer script that prints out the names. It takes a list of paths on stdin, as given by find. Not so elegant, but still:

#!/usr/bin/perl -w

use strict;
use warnings;

my %dup_series_per_dir;
while (<>) {
    my ($dir, $file) = m!(.*/)?([^/]+?)$!;
    push @{$dup_series_per_dir{$dir||'./'}{lc $file}}, $file;
}

for my $dir (sort keys %dup_series_per_dir) {
    my @all_dup_series_in_dir = grep { @{$_} > 1 } values %{$dup_series_per_dir{$dir}};
    for my $one_dup_series (@all_dup_series_in_dir) {
        print "$dir\{" . join(',', sort @{$one_dup_series}) . "}\n";
    }
}
Christoffer Hammarström answered Sep 30 '22 03:09