Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can you show me two actual, non-trivial strings that produce the same MD5 or SHA1 hash?

...and if not, why not?

So here's the question behind the question.

I understand that the likelihood of accidental collisions in MD5 and SHA1 is small (though less likely in SHA1 than in MD5). I also understand that deliberate collisions are theoretically possible. Is it practically possible? Could I go through some process to deliberately generate two messages with the same hash, in either of these algorithms? What process would I go through?

like image 436
OldPeculier Avatar asked Aug 19 '11 17:08

OldPeculier


1 Answers

Collisions necessarily exist for a given hash function, in a mathematical sense: there are more possible inputs than possible outputs, so there must be two inputs which map to the same output. Now proving the existence of a collision, and actually finding one, are two different things. If I drop a diamond in the middle of the ocean, I positively know that there is now a diamond somewhere in the ocean -- but I am quite at a loss if I want to recover it.

For a "generic" hash function with an output of n bits, there are generic methods to find a collision, with average cost 2n/2 evaluations of the function (see this page). Depending on n, this can range from the easy to the totally unfeasible. MD5 has an output of 128 bits, and 264 is "quite high": you can do it, but it will require a few thousands of machines and months of computations.

Now there are known weaknesses in MD5, i.e. some internal structure which can be exploited to produce collisions much more easily. Best attack on MD5 known so far requires a bit less than 221 function invocations, so this is a matter of a few seconds (at most) on a basic PC. @Omri points in his response to a great example of an MD5 collision, in which the colliding messages are actually executable files with widely different behaviors.

For SHA-1, the output has size 160 bits. This means that a generic collision attack has cost about 280, which is not attainable with existing technology (well, Mankind could do it, but certainly not discreetly: it should be doable with, say, the equivalent of one year of budget for the whole US Army). However, SHA-1, like MD5, has known weaknesses. Right now, these weaknesses are still theoretical, in that they lead to a collision attack with cost 261, which is too expensive for any single crypto research lab, and thus has not been fully conducted yet (there was an announced attack with cost 251 but it seems that it was a dud -- the analysis was flawed). So no actual collision to show (but researchers are pretty sure that the 261 attack is correct and would work, if someone found the budget).

With SHA-256, there is no known weakness, and the 256-bit output size implies a generic cost of 2128, far away into the undoable with today's and tomorrow's technology.

like image 156
Thomas Pornin Avatar answered Nov 26 '22 11:11

Thomas Pornin