Is there a way to detect duplicate images that have different names using PHP? I want to delete all duplicates.
Click the Method dropdown list, select Visual Compare. Click the Start Scan button. It will start scanning for both duplicate photos with different names and duplicate photos with the same names. In other words, it will find duplicate pictures based on visual content, regardless of the file name.
Go to www.images.google.com and click the camera icon in the search bar. You can then upload the image or paste its URL to search for similar images on the web. Once the image is uploaded, click the “Search by image” tab.
You can compare files by their sha1_file() hash.
It returns a 40-character hexadecimal string, so two files with identical contents get the same hash regardless of their names.
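A minimal sketch of that comparison. `sameContent()` is a helper name made up for this example, not a built-in PHP function, and the two temporary files stand in for real images.

```php
<?php
// Hypothetical helper: true when both files contain identical bytes,
// judged by their SHA-1 digests. sha1_file() returns false on a read failure.
function sameContent(string $a, string $b): bool
{
    $hashA = sha1_file($a);
    $hashB = sha1_file($b);
    return $hashA !== false && $hashA === $hashB;
}

// Demo: two temporary files that differ only in name.
$first  = tempnam(sys_get_temp_dir(), 'img');
$second = tempnam(sys_get_temp_dir(), 'img');
file_put_contents($first, 'same bytes');
file_put_contents($second, 'same bytes');

var_dump(strlen(sha1_file($first)));    // int(40)
var_dump(sameContent($first, $second)); // bool(true)

unlink($first);
unlink($second);
```

Note that this only catches byte-identical copies; a resized or re-saved image gets a different hash.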
I suppose a somewhat simple solution would be to do a checksum on the images using md5_file() (md5() itself hashes a string, not a file).
Open a directory, loop through the files generating md5s, compare md5s, delete duplicates.
EDIT: Here's a script using hash_file()
<?php
$dir = "/full/path/to/images";
$checksums = array(); // hash => true for every file already seen
if ($h = opendir($dir)) {
    while (($file = readdir($h)) !== false) {
        $path = "{$dir}/{$file}";
        // skip directories (including "." and "..")
        if (is_dir($path)) continue;
        $hash = hash_file('md5', $path);
        // skip files that could not be read
        if ($hash === false) continue;
        if (isset($checksums[$hash])) {
            // delete duplicate
            unlink($path);
        } else {
            // remember the hash of the first copy
            $checksums[$hash] = true;
        }
    }
    closedir($h);
}
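Since unlink() is irreversible, it may be worth listing the duplicates first. The following is a sketch of a dry-run variant, not part of the original script: `findDuplicateGroups()` is a hypothetical helper name, and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper: map each MD5 digest to the files that share it,
// keeping only digests that occur more than once. Nothing is deleted.
function findDuplicateGroups(string $dir): array
{
    if (!is_dir($dir)) {
        return [];
    }
    $groups = [];
    foreach (scandir($dir) ?: [] as $name) {
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        if (!is_file($path)) {
            continue; // skips ".", ".." and subdirectories
        }
        $hash = hash_file('md5', $path);
        if ($hash !== false) {
            $groups[$hash][] = $path;
        }
    }
    // Keep only hashes shared by two or more files.
    return array_filter($groups, fn(array $files) => count($files) > 1);
}

// Print one line per group of identical files.
foreach (findDuplicateGroups('/full/path/to/images') as $hash => $files) {
    echo $hash, ': ', implode(', ', $files), PHP_EOL;
}
```

Once the report looks right, every file after the first in each group can be passed to unlink().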
I spent a lot of time looking for the best solution in PHP but failed. Read my 5 steps to heaven (or just skip to step #5).
1. hash_file() does not work as desired. For example, in a folder of 11,000 pictures with different names, where I know only 800 are unique, hash_file() found only 30 matches. It only matches byte-identical files, so resized or re-saved copies get different hashes.
2. I could not install a third-party library such as http://libpuzzle.pureftpd.org/project/libpuzzle/php on Windows + Openserver.
3. I tried comparing by dominant color and pixel by pixel with ImageColorAt(), creating a "digital stamp" of each image. It is very slow, takes a lot of code, and the results are poor: resized, merged, or rotated copies slip through.
4. I checked GitHub for a ready-to-use solution, but there is nothing in PHP (why? That surprised me).
5. Finally, I found the shareware desktop program http://www.mindgems.com/products/VS-Duplicate-Image-Finder/VSDIF-Tutorials.htm?postinstall=1 which worked just super (fast! It is multithreaded and loads the CPU to 100%; with 8 GB, 11,000 images were compared in about 30 seconds) and has all the necessary functions, exceptions and filtering. In that directory of 11,000 images the program found all visually similar images, showed me the groups and their instances, and let me move the selected ones with auto-filters, and so on. The main disadvantage is the price, but there are torrents ;)
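For anyone who still wants to try the "digital stamp" idea from step 3 in plain PHP, here is a rough sketch of an average hash (aHash). This is a technique swapped in for illustration, not the method used above or by the desktop program. It assumes the GD extension is available, all function names are made up for this example, and it tolerates resizing and re-saving but not rotation.

```php
<?php
// Pure part: turn a flat list of grayscale values (0-255) into a bit string,
// '1' where the pixel is at least as bright as the mean, '0' otherwise.
function bitsFromGray(array $gray): string
{
    $mean = array_sum($gray) / count($gray);
    $bits = '';
    foreach ($gray as $value) {
        $bits .= ($value >= $mean) ? '1' : '0';
    }
    return $bits;
}

// Number of positions at which two bit strings differ.
function hammingDistance(string $a, string $b): int
{
    $diff = 0;
    $len = min(strlen($a), strlen($b));
    for ($i = 0; $i < $len; $i++) {
        if ($a[$i] !== $b[$i]) {
            $diff++;
        }
    }
    return $diff;
}

// GD part: shrink the image to 8x8, average the RGB channels per pixel,
// and hash the resulting 64 values. Returns null if the file is unreadable.
function averageHash(string $path): ?string
{
    $data = @file_get_contents($path);
    $src = ($data === false) ? false : @imagecreatefromstring($data);
    if ($src === false) {
        return null;
    }
    $small = imagecreatetruecolor(8, 8);
    imagecopyresampled($small, $src, 0, 0, 0, 0, 8, 8, imagesx($src), imagesy($src));
    $gray = [];
    for ($y = 0; $y < 8; $y++) {
        for ($x = 0; $x < 8; $x++) {
            $rgb = imagecolorat($small, $x, $y);
            $r = ($rgb >> 16) & 0xFF;
            $g = ($rgb >> 8) & 0xFF;
            $b = $rgb & 0xFF;
            $gray[] = intdiv($r + $g + $b, 3);
        }
    }
    return bitsFromGray($gray);
}
```

Two images would then count as near-duplicates when hammingDistance(averageHash($a), averageHash($b)) is small. The cut-off is an assumption to tune on your own files; something around 5 of 64 bits is a common starting point.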