I'm wondering if there is a "best" choice for collation in MySQL for a general website where you aren't 100% sure of what will be entered? I understand that all the encodings should be the same, such as MySQL, Apache, the HTML and anything inside PHP.
In the past I have set PHP to output in "UTF-8", but which collation does this match in MySQL? I'm thinking it's one of the UTF-8 ones, but I have used utf8_unicode_ci, utf8_general_ci, and utf8_bin before.
For any version of MySQL or MariaDB that supports it (MySQL 5.5.3 or later), use utf8mb4 with its default collation.
MySQL supports several Unicode character sets, including utf8mb4: a UTF-8 encoding of the Unicode character set using one to four bytes per character.
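To make the "one to four bytes per character" point concrete, here is a small Python sketch that measures how many bytes a few sample characters take in UTF-8. It does not talk to MySQL at all; the sample characters and the variable names are chosen purely for illustration. The relevance is that MySQL's older utf8 character set (an alias for utf8mb3) stores at most three bytes per character, so any character that needs four bytes only fits in utf8mb4.

```python
# Sample characters, written as escapes so the source file encoding does not matter:
# "A", e-acute, the euro sign, and an emoji (illustrative choices only).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of bytes each character occupies when encoded as UTF-8.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

# A character fits in the legacy 3-byte utf8 (utf8mb3) only if it needs <= 3 bytes.
fits_in_utf8mb3 = {ch: n <= 3 for ch, n in byte_lengths.items()}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X}: {n} byte(s), fits in 3-byte utf8: {fits_in_utf8mb3[ch]}")
```

Only the last sample needs four bytes, which is the kind of input (emoji, some CJK characters) that a general website cannot rule out.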
utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.
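As a rough illustration of what "no expansions" means, the Python sketch below contrasts a one-to-one comparison key with one that allows a single character to expand into two. This is not MySQL's algorithm: the helper names `one_to_one_key` and `expanding_key` are hypothetical, and Python's `str.casefold()` merely stands in for an expansion-aware collation. The example mirrors the case the MySQL manual uses, where utf8_general_ci treats ß as equal to s while utf8_unicode_ci treats it as equal to ss.

```python
# Hypothetical one-to-one mapping: each character maps to exactly one character.
ONE_TO_ONE = {ord("\u00df"): "s"}  # sharp s -> "s"


def one_to_one_key(text):
    """Comparison key where every character maps to a single character."""
    return text.lower().translate(ONE_TO_ONE)


def expanding_key(text):
    """Comparison key that allows expansions (casefold turns sharp s into "ss")."""
    return text.casefold()


word = "Stra\u00dfe"  # German "Strasse" written with a sharp s

print(one_to_one_key(word) == one_to_one_key("Strase"))   # one-to-one: matches "Strase"
print(one_to_one_key(word) == one_to_one_key("Strasse"))  # one-to-one: misses "Strasse"
print(expanding_key(word) == expanding_key("STRASSE"))    # expansion: matches "STRASSE"
```

With the one-to-one key, the spelling a German reader would consider equivalent is not matched, which is the accuracy trade-off described above.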
By manoj on April 23rd, 2018. Changing the Database Collation in PhpMyAdmin. A collation is a set of rules that defines how to compare and sort character strings. Every character set has at least one collation. Before MySQL 8.0, the default character set was latin1, with a default database collation of latin1_swedish_ci; MySQL 8.0 defaults to utf8mb4.
Actually, you probably want to use utf8_unicode_ci or utf8_general_ci.
utf8_general_ci sorts by stripping away all accents and sorting as if it were ASCII.
utf8_unicode_ci uses the Unicode sort order, so it sorts correctly in more languages.
However, if you are only using this to store English text, these shouldn't differ.
The main difference is sorting accuracy (when comparing characters in the language) and performance. The only special one is utf8_bin which is for comparing characters in binary format.
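The difference between an accent-insensitive, case-insensitive ordering and a binary ordering can be sketched in Python. This only approximates the idea and does not reproduce MySQL's collation tables: the helper names `accent_insensitive_key` and `binary_key` and the word list are made up for the example.

```python
import unicodedata


def accent_insensitive_key(text):
    """Strip accents and case, roughly in the spirit of a *_ci collation."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop combining marks (the accents separated out by NFD).
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def binary_key(text):
    """Compare raw UTF-8 bytes, in the spirit of a *_bin collation."""
    return text.encode("utf-8")


words = ["zebra", "\u00e9clair", "Apple", "eclipse"]

ci_sorted = sorted(words, key=accent_insensitive_key)
bin_sorted = sorted(words, key=binary_key)

print(ci_sorted)   # accented word sorts next to "eclipse"
print(bin_sorted)  # accented word sorts after "zebra", since its first byte is 0xC3
```

Under the binary ordering, uppercase ASCII sorts before lowercase and any accented character sorts after all plain ASCII letters, which is rarely what a user browsing a sorted list expects.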
utf8_general_ci is somewhat faster than utf8_unicode_ci, but less accurate (for sorting). The language-specific utf8 collations (such as utf8_swedish_ci) contain additional language rules that make them the most accurate for sorting in those languages. Most of the time I use utf8_unicode_ci (I prefer accuracy to small performance improvements), unless I have a good reason to prefer a specific language.
You can read more about the specific Unicode character sets in the MySQL manual: http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html