Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there an MD5 Fixed Point where md5(x) == x?

Tags:

hash

md5

People also ask

Is MD5 fixed length?

MD5 processes an arbitrary-length message into a fixed-length output of 128 bits, typically represented as a sequence of 32 hexadecimal digits.

Can 2 MD5 hashes be the same?

Generally, two files can have the same md5 hash only if their contents are exactly the same. Even a single bit of variation will generate a completely different hash value. There is one caveat, though: An md5 sum is 128 bits (16 bytes).

Is MD5 reversible justify it?

Hash functions are not reversible in general. MD5 is a 128-bit hash, and so it maps any string, no matter how long, into 128 bits. Obviously if you run all strings of length, say, 129 bits, some of them have to hash to the same value.


Since an MD5 sum is 128 bits long, any fixed point would necessarily also have to be 128 bits long. Assuming that the MD5 sum of any string is uniformly distributed over all possible sums, then the probability that any given 128-bit string is a fixed point is 1/2128.

Thus, the probability that no 128-bit string is a fixed point is (1 − 1/2128)2128, so the probability that there is a fixed point is 1 − (1 − 1/2128)2128.

Since the limit as n goes to infinity of (1 − 1/n)n is 1/e, and 2128 is most certainly a very large number, this probability is almost exactly 1 − 1/e ≈ 63.21%.

Of course, there is no randomness actually involved – either there is a fixed point or there isn't. But, we can be 63.21% confident that there is a fixed point. (Also, notice that this number does not depend on the size of the keyspace – if MD5 sums were 32 bits or 1024 bits, the answer would be the same, so long as it's larger than about 4 or 5 bits).


My brute force attempt found a 12 prefix and 12 suffix match.

prefix 12: 54db1011d76dc70a0a9df3ff3e0b390f -> 54db1011d76d137956603122ad86d762

suffix 12: df12c1434cec7850a7900ce027af4b78 -> b2f6053087022898fe920ce027af4b78

Blog post: https://plus.google.com/103541237243849171137/posts/SRxXrTMdrFN


Since the hash is irreversible, this would be very hard to figure out. The only way to solve this, would be to calculate the hash on every possible output of the hash, and see if you came up with a match.

To elaborate, there are 16 bytes in an MD5 hash. That means there are 2^(16*8) = 3.4 * 10 ^ 38 combinations. If it took 1 millisecond to compute a hash on a 16 byte value, it would take 10790283070806014188970529154.99 years to calculate all those hashes.


While I don't have a yes/no answer, my guess is "yes" and furthermore that there are maybe 2^32 such fixed points (for the bit-string interpretation, not the character-string intepretation). I'm actively working on this because it seems like an awesome, concise puzzle that will require a lot of creativity (if you don't settle for brute force search right away).

My approach is the following: treat it as a math problem. We have 128 boolean variables, and 128 equations describing the outputs in terms of the inputs (which are supposed to match). By plugging in all of the constants from the tables in the algorithm and the padding bits, my hope is that the equations can be greatly simplified to yield an algorithm optimized to the 128-bit input case. These simplified equations can then be programmed in some nice language for efficient search, or treated abstractly again, assigning single bits at a time, watching out for contraditions. You only need to see a few bits of the output to know that it is not matching the input!