
Checking duplicates in terminal?

The following command prints a long list of hashes and file names:

md5sum *.java

I have tried, without success, to list only the lines whose hashes are identical, so that I can then remove the duplicate files.

How can I filter out and delete files that have identical content?

Léo Léopold Hertz 준영 asked Oct 29 '25 03:10


1 Answer

Viewing duplicates with fdupes and less

Use fdupes, a command-line program for finding duplicate files, for example:

fdupes -r /home/masi/Documents/ > /tmp/1 
less -M +Gg /tmp/1

These commands find all duplicates and store the list in a temporary file. The less command then shows your line position and your progress through the file as a percentage. I found fdupes from this answer and its clear Wikipedia article here. You can install it with Homebrew on OS X and with apt-get on Debian-based Linux distributions such as Ubuntu.
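If fdupes is not available, the md5sum output from the question can be grouped by hash instead. The following is only a rough sketch, assuming GNU coreutils (the `-w` and `--all-repeated` options of uniq are GNU extensions); the function name `list_dupes` is made up for illustration.

```shell
#!/usr/bin/env bash
# list_dupes (hypothetical helper): print only the md5sum lines whose hash
# occurs more than once, with each group of identical files separated by a
# blank line. Assumes GNU sort and GNU uniq.
list_dupes() {
    # An MD5 hash is 32 hex characters, so compare only the first 32
    # characters of each sorted line.
    md5sum -- "$@" | sort | uniq -w32 --all-repeated=separate
}

# Example call, matching the question:
#   list_dupes *.java
```

This only lists candidates; nothing is deleted, so the groups can be checked by hand before removing anything.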

Using fdupes interactively to delete duplicates

Run

fdupes -rd /home/masi/Documents

which lets you choose which copies to keep and which to delete. An example of the interactive session:

Set 4 of 2664, preserve files [1 - 2, all]: all

   [+] /home/masi/Documents/Exercise 10 - 1.4.2015/task.bib
   [+] /home/masi/Documents/Exercise 9 - 16.3.2015/task.bib

[1] /home/masi/Documents/Celiac_disease/jcom_jun02_celiac.pdf
[2] /home/masi/Documents/turnerWhite/jcom_jun02_celiac.pdf

Set 5 of 2664, preserve files [1 - 2, all]: 2

   [-] /home/masi/Documents/Celiac_disease/jcom_jun02_celiac.pdf
   [+] /home/masi/Documents/turnerWhite/jcom_jun02_celiac.pdf

Here you can see that I have 2664 sets of duplicates. It would be nice to have a static file that stores my decisions about duplicates I want to keep; I opened a thread about this here. For instance, I keep the same .bib files in several exercise and homework directories, so fdupes should not ask a second time about duplicates the user wants to keep.
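Until something like that exists, one workaround is to filter the plain fdupes listing against a hand-written keep list before reviewing it. This is a minimal sketch and not an fdupes feature: the function name `drop_wanted_sets` and the keep-list path `~/.fdupes-keep` are hypothetical, and it assumes the default fdupes output format of one path per line with a blank line between sets.

```shell
#!/usr/bin/env bash
# drop_wanted_sets (hypothetical helper): read fdupes output on stdin and
# print only the duplicate sets that still need a decision.
# $1 is a keep-list file with one file name (no directory) per line.
# A set is dropped only when every file in it has a name from the keep list.
drop_wanted_sets() {
    awk -v keep="$1" '
        BEGIN {
            # Load the keep list while records are still single lines.
            while ((getline line < keep) > 0)
                if (line != "") wanted[line] = 1
            # Then switch to paragraph mode: one record per duplicate set,
            # one field per path.
            RS = ""; FS = "\n"
        }
        {
            all_wanted = 1
            for (i = 1; i <= NF; i++) {
                n = split($i, parts, "/")      # parts[n] is the file name
                if (!(parts[n] in wanted)) { all_wanted = 0; break }
            }
            if (!all_wanted) { printf "%s%s\n", sep, $0; sep = "\n" }
        }
    '
}

# Example call (paths are assumptions):
#   fdupes -r /home/masi/Documents | drop_wanted_sets ~/.fdupes-keep > /tmp/1
```

With `task.bib` in the keep list, the two `task.bib` copies from the session above would no longer appear in the listing, while the PDF set would remain. The filter only narrows what you review; deleting is still a separate step.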

Léo Léopold Hertz 준영 answered Oct 31 '25 20:10 (5 revs)