
What is the purpose of padding an md5 message if it is already the right length?

Tags:

padding

hash

md5

I know the process for padding in MD5, but what is the purpose of adding a 1 and several 0s to a message that is already the correct length?

Is this for security or just a marker?

pclem12 asked Sep 13 '10 14:09


1 Answer

The padding procedure must not create collisions. A message m is padded into pm, whose length is a multiple of 512 bits. Now consider pm as a message m' in its own right, i.e. with the padding bits treated as if they were part of the message. If padding left m' unchanged, as you suggest, then m and m' would yield the same hash value even though they are distinct messages. That would be a collision, also known as "not good at all".

Generally speaking, the padding must be unambiguously removable: you must be able to look at a padded message and decide without hesitation which bits come from the message itself and which were added as padding. Nothing in the hash function actually removes the padding, but removal must be conceptually feasible. That is mathematically impossible if messages whose length is a multiple of 512 bits are "padded" by adding no bits at all.
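
To make the argument concrete, here is a minimal sketch in Python. Both padding schemes are hypothetical illustrations working on 64-byte (512-bit) blocks, not MD5's actual padding: `naive_pad` adds nothing to a message that is already block-aligned, while `marker_pad` always appends a 0x80 byte (a '1' bit followed by seven '0' bits) before the zeros. The function names are made up for this example.

```python
BLOCK = 64  # block size in bytes (512 bits)

def naive_pad(message: bytes) -> bytes:
    # Hypothetical flawed scheme: zero-fill to a block boundary,
    # and add nothing at all if the message is already aligned.
    return message + b"\x00" * (-len(message) % BLOCK)

def marker_pad(message: bytes) -> bytes:
    # Always append a marker byte 0x80, then zero-fill to a block boundary.
    padded = message + b"\x80"
    return padded + b"\x00" * (-len(padded) % BLOCK)

def marker_unpad(padded: bytes) -> bytes:
    # The last 0x80 byte is always the marker, because only zero
    # bytes can follow it. Everything before it is the message.
    return padded[:padded.rindex(b"\x80")]

m = b"abc"
m_prime = naive_pad(m)  # a distinct, block-aligned message

# Under the naive scheme both messages pad to the same bytes,
# so any hash computed over the padded data would collide.
print(naive_pad(m) == naive_pad(m_prime))

# With the marker, the padded forms differ and padding is removable.
print(marker_pad(m) == marker_pad(m_prime))
print(marker_unpad(marker_pad(m_prime)) == m_prime)
```

The marker is what lets `marker_unpad` work even when the message itself ends in zero bytes, which is exactly the case the naive scheme cannot distinguish.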

The above applies to all hash functions. MD5 and other functions of the same general family (including SHA-1, SHA-256...), which use the Merkle-Damgård construction, also need the input length to be encoded in the padding (this is necessary for some security proofs). In MD5, the length is encoded as a 64-bit number. Together with the '1' bit, that makes at least 65 padding bits for any message (and at most 576).
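
As a rough sketch of the MD5 padding rule as I understand it from RFC 1321: append a '1' bit, then '0' bits until the length is 448 modulo 512, then the original bit length as a 64-bit little-endian integer. The helper name `md5_pad` is made up for this example, and it only handles whole-byte messages, so the smallest padding it can produce is 72 bits rather than the theoretical 65.

```python
import struct

def md5_pad(message: bytes) -> bytes:
    # Original length in bits, reduced modulo 2**64 as MD5 specifies.
    bit_length = (len(message) * 8) % (1 << 64)
    # The '1' bit plus seven '0' bits, i.e. the byte 0x80.
    padded = message + b"\x80"
    # Zero bytes until the length is 56 modulo 64 bytes (448 mod 512 bits).
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # 64-bit length field, least significant byte first.
    padded += struct.pack("<Q", bit_length)
    return padded

for size in (0, 55, 56, 64):
    padded = md5_pad(b"a" * size)
    padding_bits = (len(padded) - size) * 8
    print(size, len(padded), padding_bits)
```

A 55-byte message gets the byte-level minimum of 72 padding bits, a 56-byte message gets the maximum of 576, and a message that is already exactly 64 bytes long still gets a full extra block of 512 padding bits.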

Thomas Pornin answered Sep 18 '22 20:09