
What is the best character set for email field?

Tags:

mysql

  • binary
  • utf8_bin
  • utf8_unicode_ci
  • utf8_general_ci

Which of these is best for storing unique email addresses in a MySQL database?

Note: the email field will be used for user login.

Yves asked Jan 17 '23 07:01


1 Answer

An email address is a piece of text. Therefore, do not use binary; use a text character set.

utf8 seems to be a good choice. I am not sure exactly which characters are allowed in email addresses, but one can expect more Unicode characters to be allowed in the future. And if you already use utf8 elsewhere in your database, using it for everything means you never have to convert from one encoding to another.

As for choosing between utf8_bin, utf8_unicode_ci and utf8_general_ci, all three use the same character set and differ only in collation, that is, in the rules used to compare and sort strings.
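To see what the collation changes in practice, here is a minimal sketch that compares the same two strings under two of these collations. The addresses are made up for illustration, and it assumes a MySQL server where the utf8 character set and these collations are available.

```sql
-- Hypothetical addresses that differ only in capitalization.

-- utf8_bin compares the raw code points, so the strings are different.
SELECT _utf8'User@Example.com' COLLATE utf8_bin
     = _utf8'user@example.com' AS equal_under_bin;        -- 0

-- utf8_unicode_ci ignores case, so the strings are equal.
SELECT _utf8'User@Example.com' COLLATE utf8_unicode_ci
     = _utf8'user@example.com' AS equal_under_unicode_ci; -- 1
```

The same rules apply whenever the column is compared, so they also decide what a unique index counts as a duplicate and what a WHERE email = ... lookup matches at login.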

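Here you have to choose between what is allowed and what is normal. In practice, email addresses are treated as case-insensitive, but strictly speaking they could be case-sensitive: the domain part is always case-insensitive, while the local part (before the @) is allowed to be case-sensitive.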

So if you use a unique index on your email column and want to allow email addresses that differ only in their capitalization, you should use utf8_bin, since collations ending in _ci are case-insensitive and would treat such addresses as duplicates.

If you use a unique index and want to reject email addresses that differ only in their capitalization, then use utf8_unicode_ci.
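The two choices can be sketched side by side as follows. The table, column, and index names are hypothetical, the addresses are invented, and the sketch assumes MySQL with InnoDB; it illustrates the behaviour rather than a recommended schema.

```sql
-- Case-sensitive uniqueness: utf8_bin
CREATE TABLE users_bin (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    UNIQUE KEY uq_users_bin_email (email)
) ENGINE=InnoDB;

INSERT INTO users_bin (email) VALUES ('user@example.com');
-- Accepted: under utf8_bin this is a different value.
INSERT INTO users_bin (email) VALUES ('User@Example.com');

-- Case-insensitive uniqueness: utf8_unicode_ci
CREATE TABLE users_ci (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
    UNIQUE KEY uq_users_ci_email (email)
) ENGINE=InnoDB;

INSERT INTO users_ci (email) VALUES ('user@example.com');
-- Rejected with a duplicate-key error (error 1062):
-- under utf8_unicode_ci this equals the row above.
INSERT INTO users_ci (email) VALUES ('User@Example.com');
```

With the _ci table, a login query such as SELECT id FROM users_ci WHERE email = 'USER@example.com' also finds the stored row, whatever capitalization the user types.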

That being said, I use utf8_unicode_ci. I want the database to recognise two addresses that differ only in capitalization, such as [email protected] and [email protected], as the same address. That is much more useful than allowing separate addresses with the same characters but different capitalization.

Dil answered Jan 18 '23 22:01