How to check if 2 files are equal using .NET? [duplicate]

Tags:

c#

Say I have a file A.doc.
Then I copy it to b.doc and move it to another directory.
For me, it is still the same file.
But how can I determine that it is?
When I download files I sometimes read about getting the MD5 something or the checksum, but I don't know what that is about.

Is there a way to check whether these files are binary equal?

Michel asked Mar 02 '10 08:03

3 Answers

If you want to be 100% sure of the exact bytes in the file being the same, then opening two streams and comparing each byte of the files is the only way.

If you just want to be pretty sure (99.9999%?), I would calculate an MD5 hash of each file and compare the hashes instead. Check out System.Security.Cryptography.MD5CryptoServiceProvider.

In my testing, if the files are usually equivalent then comparing MD5 hashes is about three times faster than comparing each byte of the file.
If the files are usually different then comparing byte-by-byte will be much faster, because you don't have to read in the whole file, you can stop as soon as a single byte differs.

Edit: I originally based this answer on a quick test that read from each file byte-by-byte and compared them byte-by-byte. I falsely assumed that the buffered nature of System.IO.FileStream would save me from worrying about hard disk block sizes and read speeds; this was not true. I retested with a program that reads from each file in 4096-byte chunks and then compares the chunks. This method is slightly faster overall than MD5 even when the files are exactly the same, and will of course be much faster if they differ.
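The chunked comparison described in the edit could look roughly like the sketch below. The answer does not include that program, so this is only one possible reading of it: the class name ChunkedFileComparer, the method FilesAreEqual, and the FillBuffer helper are all hypothetical names, and 4096 is simply the chunk size mentioned above.

```csharp
using System;
using System.IO;

public static class ChunkedFileComparer
{
    // Hypothetical helper: returns true when both files contain exactly the same bytes.
    public static bool FilesAreEqual(string path1, string path2, int chunkSize = 4096)
    {
        using (var stream1 = new FileStream(path1, FileMode.Open, FileAccess.Read))
        using (var stream2 = new FileStream(path2, FileMode.Open, FileAccess.Read))
        {
            var buffer1 = new byte[chunkSize];
            var buffer2 = new byte[chunkSize];

            while (true)
            {
                int count1 = FillBuffer(stream1, buffer1);
                int count2 = FillBuffer(stream2, buffer2);

                // One file ended before the other, so the lengths differ.
                if (count1 != count2) return false;

                // Both files ended at the same point with no mismatch found.
                if (count1 == 0) return true;

                // Compare the chunk, stopping at the first differing byte.
                for (int i = 0; i < count1; i++)
                {
                    if (buffer1[i] != buffer2[i]) return false;
                }
            }
        }
    }

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the end of the stream is reached.
    private static int FillBuffer(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

Calling ChunkedFileComparer.FilesAreEqual(filepath1, filepath2) returns as soon as a chunk differs, which is why this approach should win when the files are usually different.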

I'm leaving this answer as a mild warning about the FileStream class, and because I still think it has some value as an answer to "how do I calculate the MD5 of a file in .NET". Apart from that though, it's not the best way to fulfill the original request.

Example of calculating the MD5 hashes of two files (now tested!):

// filepath1 and filepath2 hold the paths of the two files to compare.
using (var reader1 = new System.IO.FileStream(filepath1, System.IO.FileMode.Open, System.IO.FileAccess.Read))
{
    using (var reader2 = new System.IO.FileStream(filepath2, System.IO.FileMode.Open, System.IO.FileAccess.Read))
    {
        byte[] hash1;
        byte[] hash2;

        // ComputeHash reads the stream to its end and returns the MD5 digest.
        using (var md51 = new System.Security.Cryptography.MD5CryptoServiceProvider())
        {
            hash1 = md51.ComputeHash(reader1);
        }

        using (var md52 = new System.Security.Cryptography.MD5CryptoServiceProvider())
        {
            hash2 = md52.ComputeHash(reader2);
        }

        // Compare the two digests byte by byte, stopping at the first mismatch.
        int j;
        for (j = 0; j < hash1.Length; j++)
        {
            if (hash1[j] != hash2[j])
            {
                break;
            }
        }

        // j reaches hash1.Length only if every byte matched.
        if (j == hash1.Length)
        {
            System.Console.WriteLine("The files were equal.");
        }
        else
        {
            System.Console.WriteLine("The files were not equal.");
        }
    }
}
Coxy answered Oct 17 '22 14:10



First compare the sizes of the files. If the sizes are not the same, then the files are different. If the sizes are the same, then simply compare the files' contents.
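A minimal sketch of that size-first check, assuming the standard System.IO.FileInfo and File.OpenRead APIs. The class and method names are made up for illustration. ReadByte keeps the example short; the timing notes in the first answer suggest reading in larger chunks would be faster for big files.

```csharp
using System.IO;

public static class SizeFirstComparer
{
    // Hypothetical helper: cheap length check first, content comparison only if needed.
    public static bool FilesAreEqual(string path1, string path2)
    {
        var info1 = new FileInfo(path1);
        var info2 = new FileInfo(path2);

        // Different lengths mean different content, so neither file has to be read.
        if (info1.Length != info2.Length) return false;

        using (var stream1 = File.OpenRead(path1))
        using (var stream2 = File.OpenRead(path2))
        {
            int byte1;
            // ReadByte returns -1 at the end of the stream.
            while ((byte1 = stream1.ReadByte()) != -1)
            {
                if (byte1 != stream2.ReadByte()) return false;
            }

            // The first stream is exhausted; the second must be exhausted too.
            return stream2.ReadByte() == -1;
        }
    }
}
```

The length check costs only a metadata lookup, so files of different sizes are rejected without reading any of their content.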

user88637 answered Oct 17 '22 14:10



Indeed there is. Open both files, read them in as byte arrays, and compare each byte. If every byte is equal, then the files are equal.
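For small files, that idea fits in a few lines. This is only a sketch: the class and method names are invented here, and because File.ReadAllBytes loads each file fully into memory it is probably not a good fit for very large files.

```csharp
using System.IO;
using System.Linq;

public static class InMemoryComparer
{
    // Hypothetical helper: loads both files into byte arrays and compares them.
    public static bool FilesAreEqual(string path1, string path2)
    {
        byte[] bytes1 = File.ReadAllBytes(path1);
        byte[] bytes2 = File.ReadAllBytes(path2);

        // SequenceEqual returns false if the lengths differ or any element differs.
        return bytes1.SequenceEqual(bytes2);
    }
}
```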

Noon Silk answered Oct 17 '22 14:10
