Is there any efficiency analysis of how MD5 performance depends on file size? Does it actually depend on the file size, or on the content of the file? For example, if I have a 500 MB file of all blank spaces and a 500 MB file with a movie in it, would MD5 take the same time to generate the hash code?
Any hash function has to process every byte of what you're hashing. You have to read the file through a stream at the very least, and more bytes take longer to traverse. However, I'd say (generally speaking) the bottleneck will be reading the file, no matter what you're trying to do with it, not hashing it once you've read it.
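To illustrate the "read it through a stream" point, here is a minimal sketch using Python's standard `hashlib`. The helper name `md5_of_file` and the 1 MiB chunk size are assumptions chosen for illustration, not anything from the question.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns b"" at end of file.
        # Every byte passes through update() exactly once, so the work
        # grows with file size regardless of what the bytes contain.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat, so the same function works for a 500 MB file without loading it all at once.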
Edit: I kinda misread the question. It will take essentially the same amount of time to hash two files of equal size. 500 MB of spaces is 500 MB of bytes that each represent a space. That's still 8 bits of data per byte, same as any other file.
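If you want to check this yourself, a rough sketch like the following times MD5 over two in-memory buffers of equal size, one all spaces and one random. The 16 MiB size is an assumption standing in for the 500 MB files, and `time_md5` is a hypothetical helper. Hashing in memory also takes disk reads out of the picture.

```python
import hashlib
import os
import time

def time_md5(data):
    """Return the seconds taken to compute the MD5 digest of `data`."""
    start = time.perf_counter()
    hashlib.md5(data).hexdigest()
    return time.perf_counter() - start

SIZE = 16 * 1024 * 1024          # assumed stand-in for the 500 MB files
spaces = b" " * SIZE             # every byte is 0x20
random_bytes = os.urandom(SIZE)  # incompressible, movie-like content

print(f"spaces: {time_md5(spaces):.4f} s")
print(f"random: {time_md5(random_bytes):.4f} s")
```

The two timings should come out close to each other, though individual runs will vary with system noise, so repeat the measurement a few times before drawing conclusions.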
Because MD5 consists mostly of XOR, AND, OR and NOT operations (plus 32-bit additions and rotations), the speed does not depend on whether a given bit contains a 1 or a 0.
From http://en.wikipedia.org/wiki/MD5:
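There are four possible functions F; a different one is used in each round:
F(B, C, D) = (B ∧ C) ∨ (¬B ∧ D)
G(B, C, D) = (B ∧ D) ∨ (C ∧ ¬D)
H(B, C, D) = B ⊕ C ⊕ D
I(B, C, D) = C ⊕ (B ∨ ¬D)
⊕, ∧, ∨, ¬ denote the XOR, AND, OR and NOT operations respectively.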
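As a sketch, the four MD5 round functions (as defined in RFC 1321) can be written as plain bitwise operations on 32-bit words. The `MASK` constant is an assumption needed only because Python integers are unbounded. Note that nothing here branches on the value of any bit, which is why the content of the input does not change the amount of work.

```python
MASK = 0xFFFFFFFF  # keep results within 32 bits, since Python ints are unbounded

def F(b, c, d):
    # For each bit position: take c where b is 1, otherwise d.
    return ((b & c) | (~b & d)) & MASK

def G(b, c, d):
    # For each bit position: take b where d is 1, otherwise c.
    return ((b & d) | (c & ~d)) & MASK

def H(b, c, d):
    # Bitwise parity of the three inputs.
    return (b ^ c ^ d) & MASK

def I(b, c, d):
    return (c ^ (b | ~d)) & MASK
```

Each function runs the same fixed sequence of operations for any inputs, for example `F(0xFFFFFFFF, c, d)` simply returns `c`.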