Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Example of hash collision (printable strings) [closed]

I've searched a lot for MD5 hash collision, but I've found binary examples only. I would like to find two UTF-8 strings, which have the same MD5 hash. Are there any, or does the collision only work for binary data?

like image 796
Iter Ator Avatar asked Oct 06 '14 13:10

Iter Ator


2 Answers

It's definitely possible:

  • We all agree there are collisions for MD5 due to the birthday paradox - we are mapping infinitely many possible inputs to elements belonging to a finite sequence.
  • There is a solid chance that there are infinitely many collisions: we are able to produce infinite pairs of input and MD5 tries to map them uniformly.

By that alone some of these collisions are bound to be valid UTF-8 strings, but they're extremely rare, since most of these will be just random binary garbage.

If you absolutely need to find such messages, I recommend using collision finder written by Patrick Stach, which should return pair of arbitrary messages within a few hours, or my attempt to improve it. The latter uses techniques presented in later papers by Wang (the first person to demonstrate examples of MD5 collisions), Lian, Sasaki, Yajima and Klima.

I think you could also use length extension attack to some extent, but it requires deeper understanding of what happens inside MD5.

like image 131
rr- Avatar answered Sep 30 '22 15:09

rr-


There are UTF-8 collisions. By the nature of cryptographic hashes, finding them is intentionally difficult, even for a hash as broken as MD5.

You might search for MD5 Rainbow Tables, which can be used for password cracking, and hence for UTF-8 strings. As @alk pointed out, a brute force search is going to take a very long time.

like image 37
rossum Avatar answered Sep 30 '22 15:09

rossum