Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the probability of md5 collision if I pass in 2^32 sets of string?

Tags:

md5

What is the probability of md5 collision if I pass in 2^32 sets of string?

Can I say the answer is just 2^32/2^128 = 1/1.2621774e-29 as the length of bit of md5 hash is 128?

like image 484
Harold Chan Avatar asked Feb 20 '13 05:02

Harold Chan


1 Answers

This question is similar to the so-called "birthday paradox".

In probability theory, the birthday problem or birthday paradox concerns the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367 (since there are 366 possible birthdays, including February 29). However, 99% probability is reached with just 57 people, and 50% probability with 23 people. These conclusions are based on the assumption that each day of the year (except February 29) is equally probable for a birthday.

The mathematics behind this problem led to a well-known cryptographic attack called the birthday attack, which uses this probabilistic model to reduce the complexity of cracking a hash function.

According to the Wikipedia article, the chance of a collision when choosing n = 232 random numbers from a space with d = 2128 numbers is approximately:

Generalized birthday problem math

If you work this calculation out the chance is about 2.7×10-20. This is a very small probability, but notice that it is 9 orders of magnitude higher than your proposed calculation.

like image 116
John Kugelman Avatar answered Sep 20 '22 17:09

John Kugelman