How is MD5 generation dependent on file size?

Is there any efficiency analysis of how MD5 generation depends on file size? Does it actually depend on the size of the file, or on its content? For example, if I have a 500 MB file of all blank spaces and a 500 MB file containing a movie, would MD5 take the same time to generate the hash?

Kazoom asked Dec 07 '22 06:12



2 Answers

Any hash is, by definition, computed over every byte of what you're hashing. You have to read the file through a stream at the very least, so more bytes take longer to traverse. However, I'd say (generally speaking) the bottleneck will be reading the file, no matter what you're trying to do with it, not hashing it once you've read it.

Edit: I kinda misread the question. It will take the same amount of time to hash two files of equal size. 500 MB of spaces is 500 MB of bytes that each represent "space". That's still 8 bits of data per byte, same as any other file.
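To check this yourself, here is a minimal sketch using Python's standard `hashlib`. It hashes two in-memory buffers of equal size, one all spaces and one random bytes, in fixed-size chunks the way you would when streaming a file. The helper names `md5_hex` and `time_md5`, the 1 MB chunk size, and the 64 MB buffer (a stand-in for the 500 MB files) are my own choices for illustration, not anything from the question.

```python
import hashlib
import os
import time


def md5_hex(data: bytes, chunk_size: int = 1 << 20) -> str:
    """Hash data in fixed-size chunks, as one would when streaming a file."""
    h = hashlib.md5()
    view = memoryview(data)  # slice without copying
    for start in range(0, len(view), chunk_size):
        h.update(view[start:start + chunk_size])
    return h.hexdigest()


def time_md5(data: bytes) -> float:
    """Return the wall-clock seconds spent hashing data."""
    start = time.perf_counter()
    md5_hex(data)
    return time.perf_counter() - start


if __name__ == "__main__":
    size = 64 * 1024 * 1024          # hypothetical stand-in for 500 MB
    spaces = b" " * size             # "all blank spaces"
    random_bytes = os.urandom(size)  # roughly what compressed video looks like
    print(f"spaces: {time_md5(spaces):.3f} s")
    print(f"random: {time_md5(random_bytes):.3f} s")
```

The two timings should come out close to each other, with any difference down to measurement noise rather than content. This only measures the hashing itself; with real files, disk reads would be added on top.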

Rex M answered Dec 17 '22 03:12



Because MD5 consists mostly of XOR, AND, OR and NOT operations (plus 32-bit additions and rotations, which are likewise data-independent), the speed does not depend on whether a given bit is a 1 or a 0.


From http://en.wikipedia.org/wiki/MD5:

There are four possible functions F; a different one is used in each round:

$F(B, C, D) = (B \wedge C) \vee (\neg B \wedge D)$
$G(B, C, D) = (B \wedge D) \vee (C \wedge \neg D)$
$H(B, C, D) = B \oplus C \oplus D$
$I(B, C, D) = C \oplus (B \vee \neg D)$

$\oplus, \wedge, \vee, \neg$ denote the XOR, AND, OR and NOT operations respectively.
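As a rough illustration, the four round functions can be written as plain bitwise expressions on 32-bit words. This is a sketch of just those functions as MD5 defines them, not a full MD5 implementation; the `MASK` constant is my own addition, needed because Python integers are unbounded and `~` would otherwise produce a negative number.

```python
MASK = 0xFFFFFFFF  # MD5 operates on 32-bit words


def F(b: int, c: int, d: int) -> int:
    # Bitwise select: take the bit from c where b is 1, from d where b is 0
    return ((b & c) | (~b & d)) & MASK


def G(b: int, c: int, d: int) -> int:
    # Bitwise select: take the bit from b where d is 1, from c where d is 0
    return ((b & d) | (c & ~d)) & MASK


def H(b: int, c: int, d: int) -> int:
    # Parity of the three inputs, bit by bit
    return (b ^ c ^ d) & MASK


def I(b: int, c: int, d: int) -> int:
    return (c ^ (b | ~d)) & MASK
```

Each function performs the same fixed sequence of operations whatever the input words are. There is no branch on the data, which is why a block of spaces costs the same to process as a block of video.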

gahooa answered Dec 17 '22 05:12
