
Best MySQL data type to store MD5 hash or NULL

Tags: php, mysql, md5

I have a PHP application that stores all accounts, active or not, in a single table. The table has a column called "active" which is either NULL, meaning the account is active, or contains an MD5 hash, meaning the account is inactive.

According to "Best practices for efficiently storing md5 hashes in mysql", if the column always contains an MD5 hash and is never NULL, then BINARY(16) is preferred and CHAR(32) is the next best choice. Since most of my accounts are active, and thus most of the column values will be NULL, am I better off using a different data type such as VARCHAR(32)?

user1032531 asked Sep 27 '13 17:09


2 Answers

There's no point in using VARCHAR. MD5 hashes are always 128 bits, so CHAR(32) holds the string of hex digits, and BINARY(16) holds the raw bytes after you UNHEX() the hex digits.
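As a rough illustration of the two representations, the sketch below produces the same digest in both forms on the PHP side. The input string 'example' and the variable names are placeholders, not part of the question; hex2bin() and bin2hex() play the role here that UNHEX() and HEX() play in MySQL.

```php
<?php
// Hex and raw forms of the same MD5 digest.
$hex = md5('example');          // 32 hex characters, fits CHAR(32)
$raw = md5('example', true);    // 16 raw bytes, fits BINARY(16)

// hex2bin()/bin2hex() convert between the two forms in PHP,
// analogous to UNHEX()/HEX() in MySQL.
$lengths    = [strlen($hex), strlen($raw)];
$roundTrip  = bin2hex(hex2bin($hex)) === $hex;
$sameDigest = hex2bin($hex) === $raw;
```

Either form has a fixed length, which is why a variable-length type gains nothing here.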

Using NULL is independent of data type choice. MySQL can store NULL in lieu of a string, whether the column is CHAR or VARCHAR. In fact, in InnoDB's default row format, MySQL does not store NULLs at all; it stores nothing for columns that are NULL.


Reference: http://dev.mysql.com/doc/internals/en/innodb-field-contents.html

  • Helpful Notes About NULLs:

    For the third row, I inserted NULLs in FIELD2 and FIELD3. Therefore in the Field Start Offsets the top bit is on for these fields (the values are 94 hexadecimal, 94 hexadecimal, instead of 14 hexadecimal, 14 hexadecimal). And the row is shorter because the NULLs take no space.

(emphasis mine)

Bill Karwin answered Oct 18 '22 18:10


Note: Updated: 05-11-2019

Ideally you should not be using MD5 to hash passwords anymore. The PHP manual added a Safe Password Hashing section, I believe about 4 to 5 years ago, which explains password_hash() and password_verify() and why MD5 and SHA1 are not suitable.

Keep in mind that most RDBMSs are designed to deliver very stable response times, because they buffer data in memory and/or use indexes. That makes timing attacks quite feasible when the password column appears in the WHERE clause.

The safe way to combine password_verify() with SQL is to select the row by username only and compare the password in PHP. In "pseudo" PHP code:

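// Assumes $pdo is an existing PDO connection and the form posts "username" and "password".
$stmt = $pdo->prepare("SELECT password FROM users WHERE username = :username");
$stmt->execute([':username' => $_POST['username']]);
$row = $stmt->fetch(PDO::FETCH_OBJ);
if ($row !== false && password_verify($_POST['password'], $row->password)) {
  // password correct
} else {
  // unknown user or password incorrect
}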
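For completeness, here is a minimal sketch of the hashing side that the lookup above relies on. The passwords and variable names are made up for illustration, and $stored stands in for the value that would be read back from the password column.

```php
<?php
// At registration: hash once and store the result in the password column.
$stored = password_hash('correct horse', PASSWORD_DEFAULT);

// At login: $stored is the value fetched by username; the comparison happens in PHP.
$okWithRightPassword = password_verify('correct horse', $stored);
$okWithWrongPassword = password_verify('wrong guess', $stored);

// Each call generates its own salt, so hashing the same password again
// yields a different string. This is why the hash cannot be matched in a WHERE clause.
$another = password_hash('correct horse', PASSWORD_DEFAULT);
$differs = $another !== $stored;
```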

The BINARY type is defined in the MySQL source code, in the file strings/ctype-bin.c.

This looks like the default C ASCII-based charset converted into binary. In theory, this should be faster than CHAR(32) with the ascii_bin charset.

That is because less time is needed to write and read the binary value, and it takes less disk space in indexes and in memory, since the CHAR(32) data type is 16 bytes larger.

If you want to use this, use this PHP code:

<?php
  // Passing true as the second argument returns the raw 16-byte digest,
  // which fits a MySQL BINARY(16) column.
  $hash = md5("password", true);
?>
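
Tying this back to the question's "hash or NULL" column, one possible way to prepare the value before binding it to a BINARY(16) column is sketched below. The helper name activeColumnValue() and the token string are hypothetical, not from the original posts.

```php
<?php
// Hypothetical helper: map the nullable 32-character hex hash to the value
// that would be bound to a BINARY(16) column. NULL stays NULL.
function activeColumnValue(?string $hexHash): ?string
{
    if ($hexHash === null) {
        return null; // active account: store NULL
    }
    // Validate first so hex2bin() never sees malformed input.
    if (strlen($hexHash) !== 32 || !ctype_xdigit($hexHash)) {
        throw new InvalidArgumentException('Expected a 32-character hex MD5 hash');
    }
    return hex2bin($hexHash); // inactive account: 16 raw bytes
}

$forActive   = activeColumnValue(null);
$forInactive = activeColumnValue(md5('some token'));
```

Reading the value back, bin2hex() restores the hex string whenever the column is not NULL.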

Raymond Nijland answered Oct 18 '22 17:10