I have several thousand duplicate files (jar files, for example) that I'd like to use PowerShell to
I'm new to PowerShell and am throwing this out there to the PS folks who might be able to help.
This can be achieved with the Import-Csv and Sort-Object cmdlets, which together remove duplicates from a CSV file. After Import-Csv reads the file and Sort-Object sorts its contents, the -Unique switch returns only the unique rows.
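A minimal sketch of that CSV approach, assuming a file with Name and Size columns (the sample data and the `$csvPath` location are made up for illustration). It names the columns explicitly with -Property so it is clear which fields define a duplicate row:

```powershell
# Hypothetical sample file, written to the temp folder for illustration only.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'dupes-sample.csv'
@'
Name,Size
a.jar,100
b.jar,200
a.jar,100
'@ | Set-Content -Path $csvPath

# Sort on the columns that define a duplicate, keeping one row per combination.
# The repeated a.jar row is dropped.
$uniqueRows = @(Import-Csv -Path $csvPath |
    Sort-Object -Property Name, Size -Unique)

$uniqueRows | Format-Table
```

Note that this deduplicates rows in a CSV listing, not the files on disk; it is only useful here if you have already exported a file inventory.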
In Windows, you can delete duplicate files in two ways: manually or using duplicate file removal software.
Try this:
ls *.txt -recurse | get-filehash | group -property hash | where { $_.count -gt 1 } | % { $_.group | select -skip 1 } | del
Source: http://n3wjack.net/2015/04/06/find-and-delete-duplicate-files-with-just-powershell/
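The one-liner above deletes immediately, so it may be worth previewing the result first. Here is a rough sketch of the same hash-and-group idea with full cmdlet names, wrapped in a function. `Get-DuplicateFile` and its `Root` and `Filter` parameters are hypothetical names, not part of PowerShell; Get-FileHash needs PowerShell 4.0 or later.

```powershell
# Hypothetical helper: emits every file under $Root whose content hash
# matches another file, skipping the first file of each group so that
# one copy is always kept.
function Get-DuplicateFile {
    param(
        [Parameter(Mandatory)] [string] $Root,
        [string] $Filter = '*'
    )

    Get-ChildItem -Path $Root -Filter $Filter -File -Recurse |
        Get-FileHash -Algorithm SHA256 |
        Group-Object -Property Hash |
        Where-Object { $_.Count -gt 1 } |
        ForEach-Object { $_.Group | Select-Object -Skip 1 }
}
```

To preview what would be removed without deleting anything, pipe the result to Remove-Item with -WhatIf, for example `Get-DuplicateFile -Root C:\admin -Filter *.jar | ForEach-Object { Remove-Item -LiteralPath $_.Path -WhatIf }`. Drop -WhatIf once the list looks right.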
Keep a dictionary of the files seen so far, and delete a file when its key has already been encountered:
$dict = @{}
dir c:\admin -Recurse -File | foreach {
    $key = $_.Name   # replace this with your checksum function
    if ($dict.ContainsKey($key)) {
        # current file is a duplicate of one seen earlier
        # uncomment to delete it (remove -WhatIf to delete for real):
        # Remove-Item -LiteralPath $_.FullName -WhatIf
    }
    else {
        $dict[$key] = 0   # dummy placeholder to save memory
    }
}
I used file name as a key, but you can use a checksum if you want (or both) - see code comment.
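As a sketch of the checksum variant, the key can come from Get-FileHash instead of the file name. `Find-RepeatedFile` and `$Root` are hypothetical names chosen for this example, and SHA256 is just one reasonable algorithm choice:

```powershell
# Hypothetical helper: walks $Root once and emits each file whose content
# hash was already seen on an earlier file. The first file with a given
# hash is remembered and never emitted.
function Find-RepeatedFile {
    param(
        [Parameter(Mandatory)] [string] $Root
    )

    $seen = @{}   # hash string -> path of the first file with that content
    Get-ChildItem -LiteralPath $Root -File -Recurse | ForEach-Object {
        $key = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
        if ($seen.ContainsKey($key)) {
            $_    # duplicate: same content as $seen[$key]
        }
        else {
            $seen[$key] = $_.FullName
        }
    }
}
```

The function only reports duplicates; deleting is left to the caller, for example `Find-RepeatedFile -Root C:\admin | Remove-Item -WhatIf`. Hashing every file can be slow on large trees, so combining the file length with the hash is a possible refinement.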