According to the documentation (emphasis mine):
TEXT[(M)] [CHARACTER SET charset_name] [COLLATE collation_name]
A TEXT column with a maximum length of 65,535 (2^16 − 1) characters. The effective maximum length is less if the value contains multibyte characters. Each TEXT value is stored using a 2-byte length prefix that indicates the number of bytes in the value.
Would it be more accurate to say that a TEXT column can store 65535 bytes? What is the specific impact of multibyte characters in a TEXT column?
Here's the source of my confusion:
In MySQL 5, CHAR and VARCHAR fields were changed so that they count characters instead of bytes (e.g., you can fit "你好,世界!" into a VARCHAR(6)). Did TEXT fields get the same treatment, or do they still count bytes?
As far as I know, a character in UTF-8 is at most 32 bits (4 bytes) long.
Edit: in MySQL, utf8 characters are at most 3 bytes long; utf8mb4 characters are at most 4 bytes long.
So in the worst case, with only the widest characters, a TEXT column holds:
utf8: 65535 / 3 = 21845
utf8mb4: 65535 / 4 = 16383.75 ≈ 16383
https://stackoverflow.com/a/9533324/2575671
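The same arithmetic as a small Python sketch. The names `TEXT_MAX_BYTES` and `max_chars` are made up for illustration and are not part of MySQL; the function only applies floor division of the 65,535-byte limit by a fixed character width.

```python
# Largest length a 2-byte length prefix can express: 2^16 - 1 bytes.
TEXT_MAX_BYTES = 2**16 - 1


def max_chars(bytes_per_char: int) -> int:
    """Worst-case count of fixed-width characters that fit in 65,535 bytes."""
    return TEXT_MAX_BYTES // bytes_per_char


if __name__ == "__main__":
    # Widths 1 to 4 cover every possible UTF-8 character size.
    for width in (1, 2, 3, 4):
        print(f"{width}-byte characters: {max_chars(width)}")
```

Real text usually mixes widths, so the actual capacity lies somewhere between the 4-byte worst case and 65,535 single-byte characters.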
Edit 2:
I tested locally with 10.1.21-MariaDB. Test characters (UTF-8 encoded):
1-Byte: a
2-Byte: ö
3-Byte: 好
4-Byte: 𠜎
utf8: 21845 @3-Byte (好)
utf8mb4: 16386 @4-Byte (𠜎)
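The byte widths of the test characters can be checked without a database. This is only a sketch: it assumes Python's "utf-8" codec produces the same bytes as MySQL's utf8mb4, and `utf8_len` and `fits_in_text` are hypothetical helper names, not MySQL functions.

```python
TEXT_MAX_BYTES = 2**16 - 1  # 65,535 bytes

# The four test characters, written as escapes: a, ö, 好, 𠜎
samples = ["a", "\u00f6", "\u597d", "\U0002070e"]


def utf8_len(s: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def fits_in_text(s: str) -> bool:
    """True if the encoded string is within the 65,535-byte limit."""
    return utf8_len(s) <= TEXT_MAX_BYTES


if __name__ == "__main__":
    for ch in samples:
        print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s)")
    # 21845 three-byte characters fill the limit exactly; one more does not fit.
    print(fits_in_text("\u597d" * 21845), fits_in_text("\u597d" * 21846))
```

Whether the server truncates or rejects an oversized value depends on its SQL mode, which this sketch does not model.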
Screenshot:
http://i.imgur.com/5dmRteL.png