I'm currently developing a website that is going to show content in almost any language in the world, and I'm having trouble choosing the best collation to define in MySQL.
Which one is the best to support all characters? Or the most accurate?
Or is it just best to convert all characters to Unicode?
The accepted answer is wrong (maybe it was right in 2009).
utf8mb4_unicode_ci is the best collation to use for wide language support (it goes with the utf8mb4 character set).
Reasoning and supporting evidence:
You want to use utf8mb4 rather than utf8 because the latter only supports characters of up to 3 bytes, and you want to support 4-byte characters. (ref)
You also want to use unicode rather than general because the general collations trade accuracy for speed and never sorted correctly for every language. (ref)
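To make the 3-byte versus 4-byte point concrete, here is a minimal Python sketch that checks how many bytes a character needs in UTF-8. The helper names utf8_len and fits_in_legacy_utf8 are made up for this illustration and are not part of MySQL or any library. It assumes that MySQL's old utf8 character set stores at most 3 bytes per character, which is the limitation described above.

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_legacy_utf8(text: str) -> bool:
    """Hypothetical check: True if every character needs at most 3 bytes,
    i.e. the text could be stored in a 3-byte-per-character utf8 column."""
    return all(utf8_len(ch) <= 3 for ch in text)


# Latin letter, accented letter, CJK ideograph, emoji (outside the BMP).
samples = ["a", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X} needs {utf8_len(ch)} byte(s)")

print(fits_in_legacy_utf8("\u4e2d\u6587"))      # CJK text only
print(fits_in_legacy_utf8("hi \U0001F600"))     # text containing an emoji
```

Anything for which the second helper returns False (emoji, some rarer CJK characters, many historic scripts) is the kind of text that needs utf8mb4.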
I generally use the 8-bit UCS/Unicode Transformation Format (UTF-8), which works perfectly for any (well, most) languages:
utf8_general_ci
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html