What is the fastest way to create a hash function which will be used to check if two files are equal?
Security is not very important.
Edit: I am sending a file over a network connection and want to be sure that the files on both sides are equal.
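In one benchmark, SHA-1 was the fastest hashing function tested, at ~587.9 ms per 1M operations for short strings and 881.7 ms per 1M for longer strings. MD5 was 7.6% slower than SHA-1 for short strings and 1.3% slower for longer strings. Results like these depend heavily on the implementation and hardware, so measure on your own platform.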
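Since relative speeds vary between machines and libraries, it may be worth timing the candidates yourself. The following is only a rough sketch using Python's standard hashlib and timeit modules; the helper names time_hash and compare_algorithms, the payload sizes, and the iteration counts are arbitrary choices for illustration and are not taken from the benchmark quoted above.

```python
import hashlib
import timeit


def time_hash(algorithm, payload, number=10_000):
    """Return the seconds taken to hash `payload` `number` times."""
    return timeit.timeit(
        lambda: hashlib.new(algorithm, payload).digest(), number=number
    )


def compare_algorithms(payload, algorithms=("md5", "sha1", "sha256"), number=10_000):
    """Time each algorithm on the same payload and return {name: seconds}."""
    return {name: time_hash(name, payload, number) for name in algorithms}


# Small iteration counts keep the demo quick; raise them for stable numbers.
short_results = compare_algorithms(b"short string", number=1_000)
long_results = compare_algorithms(b"x" * 4096, number=1_000)

for label, results in (("short", short_results), ("long", long_results)):
    # Print fastest first.
    for name, seconds in sorted(results.items(), key=lambda item: item[1]):
        print(f"{label:5} {name:7} {seconds * 1000:.2f} ms")
```

The ordering you see may well differ from the figures above, which is the point of measuring locally.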
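In practice, two files have the same MD5 hash only if their contents are exactly the same. Accidental collisions are vanishingly unlikely, although MD5 collisions can be constructed deliberately, which matters only if security is a concern. Even a single bit of variation will generate a completely different hash value.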
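The single-bit behaviour is easy to check for yourself. This is a small illustration using Python's hashlib; the sample sentence and the helper name md5_hex are just assumptions made for the demo.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


original = b"The quick brown fox jumps over the lazy dog"

# Flip the lowest bit of the first byte, leaving everything else unchanged.
flipped = bytearray(original)
flipped[0] ^= 0x01
flipped = bytes(flipped)

digest_a = md5_hex(original)
digest_b = md5_hex(flipped)

# Count how many of the 32 hex digits differ between the two digests.
differing = sum(a != b for a, b in zip(digest_a, digest_b))

print(digest_a)
print(digest_b)
print(differing, "of", len(digest_a), "hex digits differ")
```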
The MD5 algorithm produces a 128-bit output, which is written as 32 hexadecimal characters. SHA-256 output is twice as long: 256 bits, or 64 hexadecimal characters.
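Each hexadecimal character encodes 4 bits, so the hex length is simply the digest size in bits divided by 4. A quick way to confirm this with Python's hashlib (hex_length is a helper name made up for this example):

```python
import hashlib


def hex_length(algorithm: str) -> int:
    """Return the number of hex characters in a digest of `algorithm`."""
    return len(hashlib.new(algorithm, b"").hexdigest())


for name in ("md5", "sha1", "sha256", "sha512"):
    chars = hex_length(name)
    print(f"{name}: {chars * 4} bits, {chars} hex characters")
```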
Among the strongest hash algorithms commonly cited are SHA-512, RIPEMD-320, and Whirlpool (these are hash functions, not encryption algorithms). Any one of them is considered strong enough to protect highly sensitive business information.
Unless you're using a really complicated and/or slow hash, loading the data from the disk is going to take much longer than computing the hash (unless you use RAM disks or top-end SSDs).
So to compare two files, compare their sizes first, and compare the hashes only if the sizes match.
This allows for a fast fail (if the sizes are different, you know that the files are different).
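A minimal sketch of that size-then-hash comparison in Python might look like the following. The function names, the choice of SHA-1, and the 1 MiB chunk size are assumptions for illustration; reading in chunks just keeps memory use flat for large files.

```python
import hashlib
import os


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks and return the hex digest."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()


def files_probably_equal(path_a, path_b, algorithm="sha1"):
    """Fast-fail on size, then fall back to comparing digests."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # different sizes: no need to read either file
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

In the network case, each side would run file_digest on its own copy and the two ends would exchange the size and the digest instead of the file contents.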
To make things even faster, you can compute the hash once and save it along with the file. Also save the file date and size into this extra file, so you know quickly when you have to recompute the hash or delete the hash file when the main file changes.
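One way this caching idea could be sketched in Python is shown below. The sidecar naming scheme (the file path plus ".hash.json"), the JSON layout, and the function name cached_digest are all hypothetical; the only point is that the stored size and modification time decide whether the saved hash can be reused.

```python
import hashlib
import json
import os


def cached_digest(path, algorithm="sha1"):
    """Return the file's digest, reusing a sidecar file when still valid."""
    sidecar = path + ".hash.json"  # hypothetical naming convention
    stat = os.stat(path)
    key = {
        "size": stat.st_size,
        "mtime_ns": stat.st_mtime_ns,
        "algorithm": algorithm,
    }

    # Reuse the stored digest only if size, date and algorithm all match.
    try:
        with open(sidecar, "r", encoding="utf-8") as f:
            cached = json.load(f)
        if isinstance(cached, dict) and all(cached.get(k) == v for k, v in key.items()):
            return cached["digest"]
    except (OSError, ValueError, KeyError):
        pass  # missing or unreadable sidecar: fall through and recompute

    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)

    record = dict(key, digest=h.hexdigest())
    with open(sidecar, "w", encoding="utf-8") as f:
        json.dump(record, f)
    return record["digest"]
```

Note that a file rewritten with the same size and an unchanged timestamp would not be detected by this check, so treat the cache as an optimisation rather than a guarantee.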
One approach might be to use a simple CRC-32 checksum first and, only if the CRC values compare equal, rerun the comparison with SHA-1 or something more robust. A fast CRC-32 will outperform a cryptographically secure hash any day.
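As a rough sketch of this two-stage idea, the following uses zlib.crc32 and hashlib.sha1 from Python's standard library. The function names and chunk size are made up for the example.

```python
import hashlib
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file incrementally."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # feed the running value back in
    return crc & 0xFFFFFFFF


def file_sha1(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file incrementally."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def two_stage_equal(path_a, path_b):
    """Cheap CRC-32 check first; SHA-1 only when the CRCs agree."""
    if file_crc32(path_a) != file_crc32(path_b):
        return False
    return file_sha1(path_a) == file_sha1(path_b)
```

Keep in mind that when the CRCs match, each file is read twice, so if disk I/O dominates the cost the second pass may cancel out the CPU time saved.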