What is the probability of md5 collision if I pass in 2^32 sets of string?
Can I say the answer is just 2^32/2^128 = 1/1.2621774e-29 as the length of bit of md5 hash is 128?
This question is similar to the so-called "birthday paradox".
In probability theory, the birthday problem or birthday paradox concerns the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367 (since there are 366 possible birthdays, including February 29). However, 99% probability is reached with just 57 people, and 50% probability with 23 people. These conclusions are based on the assumption that each day of the year (except February 29) is equally probable for a birthday.
The mathematics behind this problem led to a well-known cryptographic attack called the birthday attack, which uses this probabilistic model to reduce the complexity of cracking a hash function.
According to the Wikipedia article, the chance of a collision when choosing n = 232 random numbers from a space with d = 2128 numbers is approximately:
If you work this calculation out the chance is about 2.7×10-20. This is a very small probability, but notice that it is 9 orders of magnitude higher than your proposed calculation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With