
Check if same image has already been uploaded by comparing BASE64?

My question concerns an idea I had: checking whether an image has already been uploaded by comparing the Base64-encoded strings of the two images.

An example use case would be finding duplicates in your database.

The operation would be pretty big, I guess: first converting the images to Base64 and then using something like strcmp() to compare them.

I'm not sure if this makes a lot of sense, but what do you think of the idea?

Would it be too big of an operation? How accurate would it be? Does the idea make any sense?
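To make the idea concrete, here is a minimal sketch of what I have in mind, assuming both images are readable files on disk. The function name sameImageByBase64 is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper illustrating the idea from the question:
// encode both files as Base64 and compare the resulting strings.
function sameImageByBase64(string $firstPath, string $secondPath): bool
{
    $first = file_get_contents($firstPath);
    $second = file_get_contents($secondPath);
    if ($first === false || $second === false) {
        throw new RuntimeException('Could not read one of the files');
    }

    // base64_encode() is deterministic, so this comparison is true
    // exactly when the raw bytes of both files are identical.
    return strcmp(base64_encode($first), base64_encode($second)) === 0;
}
```

Since Base64 is a deterministic encoding, this only detects byte-identical files, and both files are loaded into memory in full.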

der-lukas asked Oct 24 '25 15:10


1 Answer

Here's a function that can help you compare files faster.

Besides the obvious check of file size, you can go further by comparing binary chunks.
For example, check the last n bytes as well as a chunk at a random offset.
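The extra chunk checks could look roughly like the sketch below. The helper names chunksMatch and tailAndRandomChunkMatch are hypothetical, and the sketch assumes both files are readable, already known to have the same size, and that $n is at least 1.

```php
<?php
// Hypothetical helper: compares $length bytes of two files starting at $offset.
function chunksMatch(string $firstPath, string $secondPath, int $offset, int $length): bool
{
    $fp1 = fopen($firstPath, 'rb');
    $fp2 = fopen($secondPath, 'rb');
    if ($fp1 === false || $fp2 === false) {
        if (is_resource($fp1)) { fclose($fp1); }
        if (is_resource($fp2)) { fclose($fp2); }
        throw new RuntimeException('Could not open one of the files');
    }

    // Move both file pointers to the same position before reading.
    fseek($fp1, $offset);
    fseek($fp2, $offset);
    $equal = fread($fp1, $length) === fread($fp2, $length);

    fclose($fp1);
    fclose($fp2);
    return $equal;
}

// Hypothetical helper: checks the last $n bytes and one chunk at a random offset.
// Assumption: both files have the same size (checked by the caller beforehand).
function tailAndRandomChunkMatch(string $firstPath, string $secondPath, int $n = 500): bool
{
    $size = filesize($firstPath);
    if ($size === false) {
        throw new RuntimeException('Could not determine the file size');
    }
    if ($size === 0) {
        return true; // two empty files have nothing left to compare
    }

    $n = min($n, $size);

    // Last $n bytes.
    if (!chunksMatch($firstPath, $secondPath, $size - $n, $n)) {
        return false;
    }

    // One chunk of $n bytes at a random offset.
    $offset = random_int(0, $size - $n);
    return chunksMatch($firstPath, $secondPath, $offset, $n);
}
```

A match here is only a hint, not proof: files can still differ in bytes that were not sampled, so a full comparison is still needed before treating them as equal.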

I used checksum comparison only as a last resort, since hashing has to read both files in full.

When optimizing the order of the checks, you can also take into account whether you generally expect the files to differ or not.

function areEqual($firstPath, $secondPath, $chunkSize = 500){

    // Fastest check first: files of different sizes cannot be equal
    if(filesize($firstPath) !== filesize($secondPath)){
        return false;
    }

    // Compare the first $chunkSize bytes
    // This is fast, and differing binary files will most likely differ here already
    // 'rb' opens the files in binary mode; === avoids PHP's loose string comparison
    $fp1 = fopen($firstPath, 'rb');
    $fp2 = fopen($secondPath, 'rb');
    $chunksAreEqual = fread($fp1, $chunkSize) === fread($fp2, $chunkSize);
    fclose($fp1);
    fclose($fp2);

    if(!$chunksAreEqual){
        return false;
    }

    // Last resort: compare checksums of the whole files
    // (whether SHA-1 or MD5 is faster depends on the PHP build and hardware)
    // === matters here too: == can treat hashes like "0e12..." as equal numbers
    return sha1_file($firstPath) === sha1_file($secondPath);
}
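
For the duplicate-finding use case from the question, comparing every pair of files grows quadratically with the number of files. A possible alternative, sketched below, is to group files by size first and then by SHA-1 checksum. The function name findDuplicateGroups is hypothetical, and the sketch assumes every path points to a readable file.

```php
<?php
// Hypothetical helper: groups paths whose contents have the same size and SHA-1 checksum.
// Returns a list of groups; each group is a list of two or more paths.
function findDuplicateGroups(array $paths): array
{
    // Bucket by size first, since filesize() is much cheaper than hashing.
    $bySize = [];
    foreach ($paths as $path) {
        $bySize[filesize($path)][] = $path;
    }

    $groups = [];
    foreach ($bySize as $sameSizePaths) {
        if (count($sameSizePaths) < 2) {
            continue; // a unique size cannot have a duplicate
        }

        // Within one size bucket, bucket again by checksum.
        $byHash = [];
        foreach ($sameSizePaths as $path) {
            $byHash[sha1_file($path)][] = $path;
        }

        foreach ($byHash as $sameHashPaths) {
            if (count($sameHashPaths) > 1) {
                $groups[] = $sameHashPaths;
            }
        }
    }

    return $groups;
}
```

If the images are tracked in a database, the same idea could be applied by storing the checksum alongside each row and looking it up on upload; that schema is an assumption and not something the question specifies.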
Ivan Batić answered Oct 26 '25 05:10