 

MD5 Hash and Base64 encoding

If I have a 32 character string (an MD5 hash) and I encode it using Base64, what's the maximum length of the encoded string?

Joaquín L. Robles asked Nov 25 '10 14:11



2 Answers

An MD5 value, taken as its raw 16-byte digest, is always 22 (useful) characters long in Base64 notation. Many Base64 encoders will also append 2 characters of padding when encoding an MD5 hash, bringing the total to 24 characters. The padding adds no useful information and can be discarded. Only the first 22 characters matter.

Here's why:

An MD5 hash is a 128-bit value. Every character in a Base64 string carries 6 bits of information, because there are 64 possible values for each character and 2^6 = 64. At 6 bits per character, 21 characters hold 126 bits and 22 characters hold 132 bits. Since 128 bits do not fit in 21 characters but do fit in 22 (with 4 bits to spare), a 128-bit value is always represented as 22 characters in Base64.
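The bit counting above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the helper name `min_base64_chars` is made up for illustration and is not part of any library.

```python
MD5_BITS = 128      # size of an MD5 digest in bits
BITS_PER_CHAR = 6   # each Base64 character encodes 6 bits (2**6 == 64)

def min_base64_chars(bits: int) -> int:
    """Smallest number of 6-bit characters that can hold `bits` bits (hypothetical helper)."""
    # Integer ceiling division, avoids floating point.
    return (bits + BITS_PER_CHAR - 1) // BITS_PER_CHAR

capacity_21 = 21 * BITS_PER_CHAR   # 126 bits: too small for 128
capacity_22 = 22 * BITS_PER_CHAR   # 132 bits: enough, with 4 bits spare
needed = min_base64_chars(MD5_BITS)

print(capacity_21, capacity_22, needed)
```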

A note on the padding:

I mentioned above that many Base64 encoding algorithms add a couple of characters of padding when encoding an MD5 value. This is because Base64 represents 3 bytes of information as 4 characters. Since MD5 has 16 bytes of information, many Base64 encoding algorithms append "==" to designate that the input of 16 bytes was 2 bytes short of the next multiple of 3, which would have been 18 bytes. These 2 equal signs add no information whatsoever to the string, and can be discarded when storing.
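As a rough illustration of the padding behaviour, here is a sketch using Python's standard `hashlib` and `base64` modules. The input `b"hello world"` and the helper `restore_padding` are arbitrary choices for the example. One caveat worth noting: Python's `base64.b64decode` expects the padding to be present, so if you strip the "==" for storage you will likely need to add it back before decoding.

```python
import base64
import hashlib

# Raw 16-byte MD5 digest of an arbitrary example input.
digest = hashlib.md5(b"hello world").digest()

padded = base64.b64encode(digest).decode("ascii")   # 24 characters, ends with "=="
stripped = padded.rstrip("=")                       # 22 useful characters

def restore_padding(s: str) -> str:
    """Re-append '=' so the length is a multiple of 4 (hypothetical helper)."""
    return s + "=" * (-len(s) % 4)

# Round trip: the stripped form still decodes to the original digest
# once the padding is restored.
decoded = base64.b64decode(restore_padding(stripped))

print(len(padded), len(stripped), decoded == digest)
```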

Thomas Albright answered Sep 19 '22 12:09


As per http://en.wikipedia.org/wiki/Base64

"Note that given an input of n bytes, the output will be (n + 2 - ((n + 2) % 3)) / 3 * 4 bytes long, which converges to n * 4 / 3 or 1.33333n for large n."

So, it will be (32 + 2 - ((32 + 2) % 3)) / 3 * 4 = (34 - (34 % 3)) / 3 * 4 = (34 - 1) / 3 * 4 = 33 / 3 * 4 = 44 characters.

Alternatively, you could take the hash in its raw binary form (128 bits) and encode that directly in Base64. That means converting 16 bytes instead of 32, which yields 24 characters (including padding) when Base64 encoded.
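To compare the two cases, here is a small Python sketch that applies the quoted formula and checks it against the standard `base64` module. The function name `base64_length` and the input `b"example input"` are assumptions made for the example.

```python
import base64
import hashlib

def base64_length(n: int) -> int:
    """Padded Base64 output length for n input bytes, per the formula quoted above."""
    return (n + 2 - ((n + 2) % 3)) // 3 * 4

md5 = hashlib.md5(b"example input")
hex_form = md5.hexdigest()   # 32-character hexadecimal string
raw_form = md5.digest()      # 16 raw bytes

# Encoding the 32-character hex string.
hex_encoded_len = len(base64.b64encode(hex_form.encode("ascii")))
# Encoding the raw 16-byte digest directly.
raw_encoded_len = len(base64.b64encode(raw_form))

print(base64_length(32), hex_encoded_len)   # hex string case
print(base64_length(16), raw_encoded_len)   # raw digest case
```

Encoding the raw digest gives the shorter result because the hex string already spends two characters on every byte of the hash.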

Arantor answered Sep 18 '22 12:09