Can we say that a truncated md5 hash is still uniformly distributed?
To avoid misinterpretations: I'm aware the chance of collisions is much greater the moment you start to hack off parts from the md5 result; my use-case is actually interested in deliberate collisions. I'm also aware there are other hash methods that may be better suited to use-cases of a shorter hash (including, in fact, my own), and I'm definitely looking into those.
But I'd also really like to know whether md5's uniform distribution also applies to chunks of it. (Consider it a burning curiosity.)
Since MediaWiki uses it (specifically, the left-most two hex-digits as characters of the result) to generate filepaths for images (e.g. /4/42/The-image-name-here.png) and they're probably also interested in an at least near-uniform distribution, I imagine the answer is 'yes', but I don't actually know.
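Yes, not exhibiting any bias is a design requirement for a cryptographic hash. MD5 is broken from a cryptographic point of view (practical collision attacks exist); however, the distribution of its output was never in question, and that holds for any fixed subset of the output bits as well.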
If you still need to be convinced, it's not a huge undertaking to hash a bunch of files, truncate the output and use ent ( http://www.fourmilab.ch/random/ ) to analyze the result.
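If you'd rather not set up ent, here is a minimal Python sketch of the same kind of experiment. It is only an illustration under some assumptions: the inputs are made-up strings of the form "input-0", "input-1", ... rather than real files, the truncation keeps the left-most two hex digits (256 buckets), and the function names truncated_md5_bucket_counts and chi_square_uniform are my own, not from any library. It counts how often each truncated value occurs and computes a chi-square statistic against a perfectly uniform expectation.

```python
import hashlib
from collections import Counter


def truncated_md5_bucket_counts(n_inputs, hex_digits=2):
    """Hash n_inputs synthetic strings and count each truncated digest.

    The inputs ("input-0", "input-1", ...) are an assumption made for
    illustration; substitute file contents or names as needed.
    """
    counts = Counter()
    for i in range(n_inputs):
        digest = hashlib.md5(f"input-{i}".encode("utf-8")).hexdigest()
        counts[digest[:hex_digits]] += 1  # keep only the left-most digits
    return counts


def chi_square_uniform(counts, n_buckets):
    """Chi-square statistic of the observed counts vs. a uniform spread."""
    total = sum(counts.values())
    expected = total / n_buckets
    stat = sum((c - expected) ** 2 / expected for c in counts.values())
    # Buckets that never occurred contribute (0 - expected)^2 / expected.
    stat += (n_buckets - len(counts)) * expected
    return stat


if __name__ == "__main__":
    n_buckets = 16 ** 2  # two hex digits
    counts = truncated_md5_bucket_counts(100_000)
    chi2 = chi_square_uniform(counts, n_buckets)
    print(f"buckets seen: {len(counts)} of {n_buckets}")
    print(f"chi-square:   {chi2:.1f} (degrees of freedom: {n_buckets - 1})")
```

For a uniform distribution the statistic should land near the number of degrees of freedom (255 here), give or take a few dozen; a value far above that would suggest bias. A run like this is a sanity check, not a proof, and it says nothing about MD5's collision resistance.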