
PowerShell to display duplicate files

Tags:

powershell

I have a task to check whether new files have been imported for the day into a shared folder, and to raise an alert if any of them are duplicates. No recursive check is needed.

The code below displays the details, including size, of all files created within the last day. However, I only need the files that share the same size, because I cannot compare them by name.

$Files = Get-ChildItem -Path E:\Script\test |
Where-Object {$_.CreationTime -gt (Get-Date).AddDays(-1)}

$Files | Select-Object -Property Name, hash, LastWriteTime, @{N='SizeInKb';E={[double]('{0:N2}' -f ($_.Length/1kb))}}
Teja554 asked May 07 '26 14:05



1 Answer

I didn't like the big DOS-like script answer written here, so here's an idiomatic way of doing it in PowerShell.

From the folder in which you want to find duplicates, just run this simple set of pipes:

Get-ChildItem -Recurse -File `
| Group-Object -Property Length `
| ?{ $_.Count -gt 1 } `
| %{ $_.Group } `
| Get-FileHash `
| Group-Object -Property Hash `
| ?{ $_.Count -gt 1 } `
| %{ $_.Group }

This shows every file, along with its hash, whose content matches at least one other file.
Each line does the following:

  • get files
    • from current directory (use -Path $directory otherwise)
    • recursively (if not wanted, remove -Recurse)
  • group based on file size
  • discard groups with fewer than 2 files
  • grab all those files
  • get hashes for each
  • group based on hash
  • discard groups with fewer than 2 files
  • get all those files
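
As a rough sketch of the same steps with the aliases spelled out, the variant below emits one object per set of duplicates instead of a flat list. The $directory and $duplicateSets variables and the Hash/Count/Paths property names are illustrative choices, not part of the pipeline above.

```powershell
# $directory is a hypothetical variable; point it at the folder to scan.
$directory = (Get-Location).Path

$duplicateSets = Get-ChildItem -Path $directory -Recurse -File |
    Group-Object -Property Length |       # bucket files by size in bytes
    Where-Object { $_.Count -gt 1 } |     # keep sizes shared by 2+ files
    ForEach-Object { $_.Group } |         # unwrap back to the file objects
    Get-FileHash |                        # SHA256 unless -Algorithm is given
    Group-Object -Property Hash |         # bucket by content hash
    Where-Object { $_.Count -gt 1 } |     # keep hashes shared by 2+ files
    ForEach-Object {
        # One summary object per set of identical files
        [pscustomobject]@{
            Hash  = $_.Name
            Count = $_.Count
            Paths = $_.Group.Path
        }
    }

$duplicateSets | Format-List
```

Grouping by size first means only files that could possibly match get hashed, which is the point of the first three stages.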

Add | %{ $_.path } to just show the paths instead of the hashes.
Add | %{ $_.path -replace "$([regex]::escape($(pwd)))",'' } to only show the relative path from the current directory (useful in recursion).
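
To see why the [regex]::Escape call matters, here is a small standalone sketch using made-up paths (both $base and $full are hypothetical). The -replace operator treats its pattern as a regular expression, so the backslashes in a Windows path have to be escaped. Anchoring with ^ is an extra precaution not present in the one-liner above.

```powershell
# Hypothetical paths, for illustration only
$base = 'C:\data\imports'
$full = 'C:\data\imports\2024\report.csv'

# Escape the base path so "\d", "\i" etc. are matched literally,
# and anchor it so only the leading prefix is removed.
$pattern  = '^' + [regex]::Escape($base)
$relative = $full -replace $pattern, ''

$relative   # \2024\report.csv
```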

For the question-asker specifically: don't forget to whack in | Where-Object {$_.CreationTime -gt (Get-Date).AddDays(-1)} right after the gci (Get-ChildItem) so you're not comparing files you don't want to consider, and remove -Recurse since you don't need a recursive check. Without the filter, this might get very time-consuming if you have a lot of coincidentally same-length files in that shared folder.
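
Put together for the question's scenario, a minimal sketch might look like the following. The function name Get-RecentDuplicateFile and its -MaxAgeDays parameter are invented for illustration. The folder path is the one from the question.

```powershell
# Hypothetical helper: non-recursive duplicate check limited to recent files.
function Get-RecentDuplicateFile {
    param(
        [Parameter(Mandatory)]
        [string]$Path,
        [int]$MaxAgeDays = 1
    )

    $cutoff = (Get-Date).AddDays(-$MaxAgeDays)

    Get-ChildItem -LiteralPath $Path -File |            # no -Recurse
        Where-Object { $_.CreationTime -gt $cutoff } |  # only recent files
        Group-Object -Property Length |
        Where-Object { $_.Count -gt 1 } |
        ForEach-Object { $_.Group } |
        Get-FileHash |
        Group-Object -Property Hash |
        Where-Object { $_.Count -gt 1 } |
        ForEach-Object { $_.Group }
}

# Usage: warn only when duplicates were found
$dupes = @(Get-RecentDuplicateFile -Path 'E:\Script\test')
if ($dupes.Count -gt 0) {
    Write-Warning 'Duplicate files found:'
    $dupes | Format-Table -Property Hash, Path -AutoSize
}
```

Filtering by date before grouping keeps older files out of both the size comparison and the hashing.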

Finally, if you're like me and just wanted to find dupes based on name (Google will probably take you here too):

gci -Recurse -file | Group-Object name | Where-Object { $_.Count -gt 1 } | select -ExpandProperty group | %{ $_.fullname }

Hashbrown answered May 09 '26 02:05



