Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between BINARY(16) and CHAR(32) when storing MD5 in database

Based on various recommendations such as What data type to use for hashed password field and what length?, I could store md5 as either CHAR(32) or BINARY(16). But when I do so using BINARY(16), the stored value is identical to the first 16 characters of the CHAR(32) stored results as well as the first 16 characters of the SELECT MD5() results. What are the significance of the later 16 characters, and is their lack of presence in the binary column resulting in lost data?

CREATE  TABLE test (id INT NOT NULL AUTO_INCREMENT, value VARCHAR(6), md5_char CHAR(32) NOT NULL, md5_binary BINARY(16) NOT NULL, PRIMARY KEY (id)) ENGINE = InnoDB;
INSERT INTO test(value,md5_char,md5_binary) VALUES("one!",md5("one!"),md5("one!"));
INSERT INTO test(value,md5_char,md5_binary) VALUES("two%",md5("two%"),md5("two%"));
INSERT INTO test(value,md5_char,md5_binary) VALUES("three~",md5("three~"),md5("three~"));
SELECT value,md5(value),md5_char,md5_binary FROM test;
DROP TABLE test;

+--------+----------------------------------+----------------------------------+------------------+
| value  | md5(value)                       | md5_char                         | md5_binary       |
+--------+----------------------------------+----------------------------------+------------------+
| one!   | 633c8403325f1cf963809e6eb224d77e | 633c8403325f1cf963809e6eb224d77e | 633c8403325f1cf9 |
| two%   | 48bbec047b4451a2018e0f652807b7d0 | 48bbec047b4451a2018e0f652807b7d0 | 48bbec047b4451a2 |
| three~ | fee453bb4eb68dcdfee07575e75c8cc5 | fee453bb4eb68dcdfee07575e75c8cc5 | fee453bb4eb68dcd |
+--------+----------------------------------+----------------------------------+------------------+
like image 652
user1032531 Avatar asked Sep 28 '13 17:09

user1032531


1 Answers

Currently you are losing half of the checksum when using BINARY(16). When you store an MD5 checksum in BINARY(16) you should store it as binary data, not encoded in hexadecimal. That is:

INSERT INTO test (md5_binary) VALUES(UNHEX(md5("one!")));

You can use the HEX function to encode it into hex again if you want to eye-ball compare it with another checksum:

SELECT HEX(md5_binary) FROM test;

The benefit of using BINARY to store the checksum instead of hexadecimal text is that half the storage is needed.

like image 85
Joni Avatar answered Nov 02 '22 02:11

Joni