
MySQL won't store a Unicode character

Why won't MySQL store the Unicode character 𫗮? Yes, it is a rare character (a CJK ideograph), and your browser may not render it.
UTF16 is U+2B5EE

Warning: #1366 Incorrect string value: '\xF0\xAB\x97\xAE' for column 'ch' at row 1

Is it possible to store this character in MySQL?
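The bytes quoted in the warning can be decoded to confirm which character the server rejected. The following is a minimal Python sketch (Python is an arbitrary choice here, not something from the question); it only inspects the bytes and does not talk to MySQL.

```python
# Bytes copied from the warning message: '\xF0\xAB\x97\xAE'
rejected = bytes([0xF0, 0xAB, 0x97, 0xAE])

# Decode them as UTF-8 to recover the character that MySQL refused.
char = rejected.decode("utf-8")

code_point = ord(char)      # numeric Unicode code point
byte_count = len(rejected)  # length of the UTF-8 sequence

print(f"U+{code_point:X} takes {byte_count} bytes in UTF-8")
```

The character needs a four-byte UTF-8 sequence, which is the detail the answers below turn on.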

Qiao asked Apr 22 '10 15:04




3 Answers

MySQL's utf8 character set only supports characters from the Basic Multilingual Plane (U+0000 to U+FFFF).

Your character is outside this plane.
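To see which characters in a string fall outside that range before sending it to the database, a check along these lines may help. This is only a sketch: the helper name `outside_bmp` and the sample string are made up for illustration.

```python
BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def outside_bmp(text):
    """Return the characters of `text` whose code point lies above the BMP."""
    return [ch for ch in text if ord(ch) > BMP_MAX]


# Hypothetical sample: three ASCII letters plus U+2B5EE from the question.
sample = "abc\U0002B5EE"
offenders = outside_bmp(sample)
print([f"U+{ord(ch):X}" for ch in offenders])
```

An empty result means every character fits in the BMP.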

Try storing a synonym instead :)

Update:

MySQL 5.5.3 and later (not yet GA at the time of writing) do support supplementary characters if you use the utf8mb4 encoding.

Quassnoi answered Oct 10 '22 03:10


First: your statement

UTF16 is U+2B5EE

is slightly wrong. U+2B5EE is the notation for a Unicode code point, which is just an integer (an abstract code), while UTF-16 is a character encoding (one of several possible Unicode encodings, as is UTF-8).
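The distinction can be made concrete with a short Python sketch (an illustration, not part of the original answer): one code point, three different byte sequences depending on the encoding.

```python
code_point = 0x2B5EE  # the abstract number, written U+2B5EE
char = chr(code_point)

utf8_bytes = char.encode("utf-8")       # four bytes
utf16_bytes = char.encode("utf-16-be")  # a surrogate pair, 2 x 2 bytes
utf32_bytes = char.encode("utf-32-be")  # the code point itself, zero-padded

print("UTF-8 :", utf8_bytes.hex())
print("UTF-16:", utf16_bytes.hex())
print("UTF-32:", utf32_bytes.hex())
```

In UTF-16 the character becomes the surrogate pair D86D DDEE, which is not the same thing as the number 2B5EE.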

Now, assuming that you mean the code point, U+2B5EE is outside the BMP (the first 64K Unicode code points), and it seems MySQL has little or no support for such characters. So I suspect you are out of luck.

leonbloy answered Oct 10 '22 02:10


Since this question was posted, MySQL 5.5.3 has been released. It supports the utf8mb4 encoding, which offers full Unicode support. Switching to this charset instead of utf8 would fix your problem.

I’ve recently written a detailed guide on how to switch from MySQL’s utf8 to utf8mb4. If you follow the steps there, everything should work correctly. The individual steps in the process are:

  • Step 1: Create a backup
  • Step 2: Upgrade the MySQL server
  • Step 3: Modify databases, tables, and columns
  • Step 4: Check the maximum length of columns and index keys
  • Step 5: Modify connection, client, and server character sets
  • Step 6: Repair and optimize all tables

I suspect that your problem can be solved by following step 5. Hope this helps!
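As a rough illustration of what steps 3 and 5 involve, the Python sketch below builds the two SQL statements typically used. It is not taken from the guide: the helper name, the table name `my_table`, and the `utf8mb4_unicode_ci` collation are assumptions, and the table name is interpolated without escaping, so it should only ever be a trusted identifier.

```python
def utf8mb4_statements(table, collation="utf8mb4_unicode_ci"):
    """Build SQL for switching the connection and one table to utf8mb4.

    `table` is a hypothetical, trusted table name; it is not escaped.
    """
    return [
        # Step 5: make the connection itself use utf8mb4.
        "SET NAMES utf8mb4",
        # Step 3: convert every character column of the table.
        f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8mb4 "
        f"COLLATE {collation}",
    ]


# Print the statements so they can be reviewed before running them by hand.
for statement in utf8mb4_statements("my_table"):
    print(statement + ";")
```

Taking a backup first (step 1) remains advisable, since converting a table rewrites its data.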

Mathias Bynens answered Oct 10 '22 03:10