Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between preimage resistance and second-preimage resistance

Wikipedia says:

  • preimage resistance: for essentially all pre-specified outputs, it is computationally infeasible to find any input which hashes to that output, i.e., it is difficult to find any preimage x given a "y" such that h(x) = y.

  • second-preimage resistance: it is computationally infeasible to find any second input which has the same output as a specified input, i.e., given x, it is difficult to find a second preimage x' ≠ x such that h(x) = h(x′).

Yet, I don't understand it. Doesn't h(x′) (where x' is input) generate that y (the output), which is then compared to the same h(x)?

Say, I have a string "example". It generates the MD5 "1a79a4d60de6718e8e5b326e338ae533". Why is it different to just use the MD5 compared to doing the MD5(example)?

like image 733
Det Avatar asked Jan 09 '23 22:01

Det


1 Answers

Ideal hashing is like taking the fingerprint of a person, it is unique, it is non-reversible (you can't get the whole person back just from the fingerprint) and it can serve as a short and simple identifier for the given person.

If we bring some of the terminology you introduced into our analogy, we see that preimage resistance refers to the hash function's ability to be non-reversible. Imagine if you could generate the likeness of a whole person from their fingerprint, aside from being really cool, this could also be very dangerous. For the same reason, hash functions must be made so that an attacker cannot find the original message that generated the hash. In that sense, hash functions are one-way in that the message generates the hash and not the other way round.

Second preimage resistance refers to a given hash function's ability to be unique. Forensic fingerprinting would be a gross waste of time if any number of individuals could share the same fingerprint (lets exclude identical twins for now. Edit: See Det's comment below). If a given hash was used for verification against data corruption, it would quite pointless if there is a good chance corrupt data can generate the same hash.

To have both preimage resistance and second preimage resistance hash functions adopt several traits to help them. One trait very common for hash functions is where the given input has no correspondence to the output. A single bit change can produce a hash that has completely no bytes shared with the hash of the original input. For this reason, a good hash function is commonly used in message authentication.

Whilst you are right comparing the original message directly would be functionally equivalent to comparing the hashes, it is simply not feasible in the majority of cases. For example:

If party A wanted to reliably send a message to party B, party A/B would need to agree upon a scheme to detect data corruption during transfer. Note: party B does not have the original message until party A sends it.

A possible scheme of transfer could be to transfer the message twice such that party B can verify if the second message equals the first. The problem with this is that there is a chance that corruption can occur twice in the same place (as well as the significantly higher bandwidth). This can only be reduced by sending the messages even more times, incurring severe bandwidth costs.

As an alternative, party A can pass his/her long message into a hash function and generate a short hash which he/she sends to party B, followed by the original message. Party B can then take the received message and pass it into the hash function and match the hashes. If either the message or the hash got corrupted even by a single bit during transfer, the resultant hashes will not match, thanks to second preimage resistance (no two plaintext should have same hash).

Preimage Resistance in this case would be useful if the message is encrypted during transfer but the hash was taken prior encryption (whether this is appropriate is another discussion). If the hash was reversible, a eavesdropper could intercept the hash and reverse to find the original message.

All hash functions are not equal, that's why its important to consider their preimage resistance/second preimage resistance when choosing which ones to use, which ones are secure and which ones should be deprecated and replaced.

like image 63
initramfs Avatar answered Apr 23 '23 22:04

initramfs