Find $secret in $a == md5($b . $secret)

Question

A function is:

$a == md5($b . $secret);

You can choose $a and $b
You don't know $secret
You get the value of the function for the $a and $b you choose as either true or false.

Is there any better attack than brute force to find the $secret in general? Is there any better attack than brute force to find the $secret using PHP's md5 function?

From what I've found on the web I think there is not, though md5 is deprecated for some other use cases. So just to be sure ...

Kind regards

Thomas Pornin · Accepted Answer

If MD5 behaves like a random oracle (that's a big "if", see below), then exhaustive search on $secret is the best possible attack -- and, more importantly, each "guess" on the value of $secret MUST use a query to the function (since you use PHP, I assume that the function is implemented in a Web server, and each "query" requires talking to that server). The latter is implied by the lack of information sent back to the attacker: the attacker does not get anything except a single bit (the "True" or "False" result). In particular, he does not get the MD5 output itself. The attacker will get a long stream of uninformative "False" results unless he hits the correct MD5 output, either out of pure chance (probability 2^-128, which is Really Darn Small), or because he properly guessed the value of $secret beforehand. It is worth noticing that this prevents the attacker from using many cost-sharing techniques, including precomputed tables, in particular the over-hyped rainbow tables.

A random oracle is a mythical object which can be viewed as a deterministic black box: you know nothing on the output you will get from a given input, except that the box will always return the same result for a given input. A model is the following: the box contains a gnome, some dice and a big book. The gnome uses the dice to randomly choose the output. He also uses the book to keep track of the answers he already sent, so as to be consistent, i.e. if the input is identical to a previously submitted input, the gnome will return the same output than previously instead of throwing dice.

MD5, however, is not a random oracle. For instance, we can build collisions for MD5 much faster than the theoretical 2⁶⁴ resistance for a function with a 128-bit output. Also, note that being a good hash function (collision-resistant and so on) does not absolutely require "random-oracleness". For instance, SHA-256 is considered to be a secure hash function, while it still suffers from the so-called "length extension attack" (given SHA256($a), one can compute SHA256($a . $b) without knowing $a, for almost arbitrary values of $b). So the guarantees of a random oracle do not hold with MD5 (or, for that matter, SHA-256). This does not mean that faster attacks are known ! Only that you are on your own here.

One can also point out that md5($b . $secret) is a kind of "keyed hash", i.e. a MAC (Message Authentication Code). Building a MAC out of a hash function is not easy, precisely because of things like the length extension attack (md5($secret . $b), for instance, would be a very poor MAC). A robust way of building a MAC out of a hash function has been designed; it is called HMAC and involves two invocations of the underlying hash function (but one of them is on a short input, so this is efficient nonetheless). Security of HMAC, more precisely how HMAC can be considered to behave as a random oracle, can be "proven", i.e. reduced to some hash function internal properties which are believed true in the case of SHA-256 (see New Proofs for NMAC and HMAC: Security without Collision-Resistance by Mihir Bellare for the gory details). By using HMAC/SHA-256 over $b, with $secret as key, you would benefit from these security results, and your construction would be more convincingly robust. Then again, I do not claim that there is a known attack on md5($b . $secret), only that both the use of MD5 and the homemade MAC construction raise red flags, which degrade the level of trust which can be imparted to such a system.

Demento · Answer

This is an interesting question, because in the typical scenarios in IT-Security you cannot choose $aand $b as an attacker. If you are able to get hold of a hashed password for example, $aand $b are already defined and you have to work with that. In that case you can only use brute-force or a rainbow table if one with the salt $b is available.

In your example on the other hand, you are free to choose both values. You can take an arbitrary secret, e.g. test and choose the values for $aand $b accordingly. I choose $b to be an empty string and calculate $a with $a = md5($secret), which results in 098f6bcd4621d373cade4e832627b4f6.

I choose $a = "098f6bcd4621d373cade4e832627b4f6" and $b= "" and ask you if $secret == "test". You say true and I say problem solved.

This finally leads us to the real answer. The two conditions given

You can choose $a and $b
You don't know $secret

do not work together. In my example I defined $secret myself. I violated the second condition. On the other hand I cannot choose $aand $b arbitrarily without deriving them from $secret, because they might not have a solution.

If we assume that there is at least one solution for all possible pairs of $aand $b (maybe there is proof for that, I don't know), and you select them in a way that you really do not know $secret, I would always want to define $b = "", to make the attack as easy as possible. Rainbow tables are your friends in that case.

Sourav · Answer

download a rainbow table / password hash of famous password ! :)

Find $secret in $a == md5($b . $secret)

Tags:

security

php

cryptography

md5

black-box

Niklas

3 Answers

Thomas Pornin

Demento

Sourav

Recent Activity

Donate For Us

Find $secret in $a == md5($b . $secret)

Tags:

security

php

cryptography

md5

black-box

Niklas

3 Answers

Thomas Pornin

Demento

Sourav

Related questions

Recent Activity

Donate For Us