What is the difference between utf8mb4 and utf8 charsets in MySQL?
I already know about the ASCII, UTF-8, UTF-16 and UTF-32 encodings, but I'm curious how the utf8mb4 group of encodings differs from the other encoding types defined in MySQL Server.
Are there any special benefits or purposes of using utf8mb4 rather than utf8?
MySQL supports multiple Unicode character sets. utf8mb4: a UTF-8 encoding of the Unicode character set using one to four bytes per character. utf8mb3: a UTF-8 encoding of the Unicode character set using one to three bytes per character. This character set is deprecated in MySQL 8.0, and you should use utf8mb4 instead.
My recommendation is to convert all tables to utf8mb4 for full UTF-8 support. utf8mb4 is also backwards compatible with utf8: every character that utf8 can store is stored identically in utf8mb4.
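To solve the problem, open the exported SQL file and search and replace utf8mb4_unicode_520_ci with utf8_general_ci first, then search and replace the remaining utf8mb4 with utf8. (Doing it in the opposite order would turn the collation name into utf8_unicode_520_ci, so the second search would no longer match.) Save the file and import it into your database. After that, change the charset option in wp-config.php to utf8. Note that this downgrade gives up support for 4-byte characters such as emoji.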
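The search-and-replace on the dump can also be scripted. The following is only a minimal sketch: the function name `downgrade_dump` and the sample statement are made up for illustration, and the collation is replaced before the charset because a plain utf8mb4 to utf8 replacement would otherwise rewrite the collation name first. It is a blind text replacement, so it would also touch any row data that happens to contain the string "utf8mb4".

```python
def downgrade_dump(sql: str) -> str:
    """Rewrite utf8mb4 references in a SQL dump to utf8 (hypothetical helper).

    The collation is replaced first: replacing "utf8mb4" first would turn
    "utf8mb4_unicode_520_ci" into "utf8_unicode_520_ci", and the collation
    replacement would then find nothing to match.
    """
    sql = sql.replace("utf8mb4_unicode_520_ci", "utf8_general_ci")
    return sql.replace("utf8mb4", "utf8")


# Illustrative statement, not taken from a real dump.
sample = (
    "CREATE TABLE t (c TEXT) "
    "DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci;"
)
print(downgrade_dump(sample))
```

For a real dump you would read the file, pass its contents through the function, and write the result to a new file, keeping the original export as a backup.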
A few years later, when MySQL 5.5.3 was released, they introduced a new encoding called utf8mb4, which is actually the real 4-byte UTF-8 encoding that you know and love. If you're using MySQL (or MariaDB or Percona Server), make sure you know your encodings.
UTF-8 is a variable-length encoding. In the case of UTF-8, this means that storing one code point requires one to four bytes. However, MySQL's encoding called "utf8" (alias of "utf8mb3") only stores a maximum of three bytes per code point.
So the character set "utf8"/"utf8mb3" cannot store all Unicode code points: it only supports the range 0x0000 to 0xFFFF, which is called the "Basic Multilingual Plane". See also Comparison of Unicode encodings.
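To make the byte counts concrete, here is a small Python sketch (nothing MySQL-specific; the helper name `utf8_length` and the sample characters are just for illustration) that prints how many bytes standard UTF-8 needs for a few code points and whether each one lies in the BMP:

```python
def utf8_length(ch: str) -> int:
    """Number of bytes needed to encode one code point in UTF-8."""
    return len(ch.encode("utf-8"))


# LATIN CAPITAL A, e with acute, EURO SIGN, GRINNING FACE emoji
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    code_point = ord(ch)
    in_bmp = code_point <= 0xFFFF  # BMP is U+0000..U+FFFF
    print(f"U+{code_point:04X}: {utf8_length(ch)} byte(s), BMP={in_bmp}")
```

Only the last sample needs four bytes, and it is also the only one outside the BMP, which is exactly the kind of character that "utf8"/"utf8mb3" cannot store.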
This is what (a previous version of the same page at) the MySQL documentation has to say about it:
The character set named utf8[/utf8mb3] uses a maximum of three bytes per character and contains only BMP characters. As of MySQL 5.5.3, the utf8mb4 character set uses a maximum of four bytes per character and supports supplementary characters:
For a BMP character, utf8[/utf8mb3] and utf8mb4 have identical storage characteristics: same code values, same encoding, same length.
For a supplementary character, utf8[/utf8mb3] cannot store the character at all, while utf8mb4 requires four bytes to store it. Since utf8[/utf8mb3] cannot store the character at all, you do not have any supplementary characters in utf8[/utf8mb3] columns and you need not worry about converting characters or losing data when upgrading utf8[/utf8mb3] data from older versions of MySQL.
So if you want your column to support storing characters lying outside the BMP (and you usually want to), such as emoji, use "utf8mb4". See also What are the most common non-BMP Unicode characters in actual use?.
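If you want to check whether some text actually needs utf8mb4, one rough approach is to look for code points above U+FFFF before inserting it. This is only a sketch on the application side; the helper names `supplementary_chars` and `fits_in_utf8mb3` are hypothetical and not part of any MySQL client library:

```python
from typing import List


def supplementary_chars(text: str) -> List[str]:
    """Return the characters in text that lie outside the BMP (above U+FFFF)."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character needs at most three bytes in UTF-8."""
    return not supplementary_chars(text)


print(fits_in_utf8mb3("caf\u00e9 costs 3\u20ac"))   # BMP-only text
print(fits_in_utf8mb3("thumbs up \U0001F44D"))      # contains an emoji
```

Text for which `fits_in_utf8mb3` returns False contains at least one supplementary character and therefore needs a utf8mb4 column.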
The utf8mb4 character set is useful because nowadays we need support for storing not only language characters but also symbols, newly introduced emojis, and so on.
The article How to support full Unicode in MySQL databases by Mathias Bynens is a nice read that sheds more light on this.