
Windows batch file to find duplicates in a tree

I need a batch file (a .bat run by the Windows CMD interpreter) that does the following:

1) Search through a folder and its subfolders

2) Find files with the same filename and extension (i.e. duplicates)

3) Check whether they have the same size

4) If the name and size both match, echo every file except the first one (in practice, I need to delete all but one copy)

Thanks for any help.

This is only an initial script that lists the files in a folder and its subfolders, along with their sizes:

@Echo off
Setlocal EnableDelayedExpansion

Set Dir=C:\NewFolder

For /r "%Dir%" %%i in (*) do (
Set FileName=%%~nxi
Set FullPath=%%i
Set Size=%%~zi
Echo "!FullPath!" - SIZE: !Size!
)
Echo.
Pause
Thummer asked Sep 27 '14 15:09


2 Answers

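@echo off

setlocal

rem Clear any existing variables whose names start with "_".
rem The 2^>nul hides the error message SET prints when none exist.
for /f "tokens=1 delims==" %%# in ('set _ 2^>nul') do (
    set "%%#="
)

rem Each file gets a marker variable named "_" plus file name plus size.
rem The first file seen defines the marker; any later file with the same
rem name and size finds it already defined and is echoed as a duplicate.
for /r %%a in (*.*) do (
    if not defined _%%~nxa%%~za (
        set "_%%~nxa%%~za=%%~fa"
    ) else (
        echo %%~fa
    )
)

endlocal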
npocmaka answered Sep 22 '22 20:09


This script does what you ask. Just set the ROOT variable at the top to point to the root of your tree.

@echo off
setlocal disableDelayedExpansion
set "root=c:\test"
set "prevTest=none"
set "prevFile=none"
for /f "tokens=1-3 delims=:" %%A in (
  '"(for /r "%root%" %%F in (*) do @echo %%~znxF:%%~fF:)|sort"'
) do (
  set "currTest=%%A"
  set "currFile=%%B:%%C"
  setlocal enableDelayedExpansion
  if !currTest! equ !prevTest! echo "!currFile!"
  endlocal
  set "prevTest=%%A"
)

But you can make the test more precise by using FC to compare the contents of the files. Also, you can incorporate the DEL command directly in the script. The script below prints out the commands that would delete the duplicate files. Remove the ECHO before the DEL command when you are ready to actually delete the files.

@echo off
setlocal disableDelayedExpansion
set "root=c:\test"

set "prevTest=none"
set "prevFile=none"
for /f "tokens=1-3 delims=:" %%A in (
  '"(for /r "%root%" %%F in (*) do @echo %%~znxF:%%~fF:)|sort"'
) do (
  set "currTest=%%A"
  set "currFile=%%B:%%C"
  setlocal enableDelayedExpansion
  set "match="
  if !currTest! equ !prevTest! fc /b "!prevFile!" "!currFile!" >nul && set match=1
  if defined match (
    echo del "!currFile!"
    endlocal
  ) else (
    endlocal
    set "prevTest=%%A"
    set "prevFile=%%B:%%C"
  )
)

Both sets of code may seem overly complicated, but only because I have structured them to be robust and to avoid problems that can plague simpler solutions. For example, a ! in a file name can cause problems with FOR variables if delayed expansion is enabled, and an = in a file name causes a problem with npocmaka's solution.
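To make the ! problem concrete, here is a minimal sketch, not taken from either answer, that creates one throwaway file with ! in its name and echoes that name two ways. The folder name bang_demo and the file name wow!.txt are made up for illustration, and the sketch assumes %temp% itself contains no ! characters. The second loop uses the same pattern as the scripts above: copy the FOR variable into an ordinary variable while delayed expansion is off, and only then switch it on.

```batch
@echo off
rem Hypothetical demo folder and file name, used only for this illustration.
setlocal disableDelayedExpansion
set "demo=%temp%\bang_demo"
if not exist "%demo%" md "%demo%"
type nul > "%demo%\wow!.txt"

echo Delayed expansion enabled while the FOR variable is expanded:
setlocal enableDelayedExpansion
rem The name is expanded first, then the delayed expansion pass strips the lone bang.
for %%F in ("%demo%\*") do echo   %%~nxF
endlocal

echo Value stored first, delayed expansion enabled afterwards:
for %%F in ("%demo%\*") do (
  rem Delayed expansion is off here, so the bang is stored literally.
  set "name=%%~nxF"
  setlocal enableDelayedExpansion
  rem The expanded value is not scanned again, so the bang survives.
  echo   !name!
  endlocal
)

rem Clean up the demo folder.
rd /s /q "%demo%"
endlocal
```

The = problem comes from how SET works: it splits its argument at the first =, so a file name containing = would define a variable with a truncated name rather than the intended name-plus-size marker.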

dbenham answered Sep 23 '22 20:09