
How many multibyte characters can fit into a `TEXT` column?

Tags: mysql

According to the documentation (emphasis mine):

TEXT[(M)] [CHARACTER SET charset_name] [COLLATE collation_name]

A TEXT column with a maximum length of 65,535 (2^16 − 1) characters. The effective maximum length is less if the value contains multibyte characters. Each TEXT value is stored using a 2-byte length prefix that indicates the number of bytes in the value.

Would it be more accurate to say that a TEXT column can store 65535 bytes? What is the specific impact of multibyte characters in a TEXT column?

Here's the source of my confusion:

In MySQL 5, CHAR and VARCHAR fields were changed so that they count characters instead of bytes (e.g., you can fit "你好,世界!" into a VARCHAR(6)). Did TEXT fields get the same treatment, or do they still count bytes?

todofixthis asked Mar 10 '23 06:03



1 Answer

As far as I know, a character in UTF-8 is at most 32 bits (4 bytes) long.

Edit: in MySQL, utf8 stores at most 3 bytes per character; utf8mb4 stores up to 4 bytes per character.

So the limit is 65535 bytes, and the worst case, with only the largest characters, is:

utf8: 65535 / 3 = 21845 characters
utf8mb4: 65535 / 4 = 16383.75, rounded down to 16383 characters

https://stackoverflow.com/a/9533324/2575671
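
The worst-case arithmetic can be reproduced with a few lines of Python. This is only a sketch of the calculation, not something MySQL runs; the names `TEXT_MAX_BYTES`, `MAX_BYTES_PER_CHAR` and `worst_case_chars` are made up for this example.

```python
# A TEXT value has a 2-byte length prefix, so it holds at most 2**16 - 1 bytes.
TEXT_MAX_BYTES = 2**16 - 1  # 65535

# Maximum bytes per character for the two MySQL character sets discussed here.
MAX_BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def worst_case_chars(charset: str) -> int:
    """Characters that fit when every character uses the maximum width."""
    # Integer division: a partially stored character does not count.
    return TEXT_MAX_BYTES // MAX_BYTES_PER_CHAR[charset]


for charset in MAX_BYTES_PER_CHAR:
    print(charset, worst_case_chars(charset))
```

The integer division is what turns 16383.75 into 16383 for utf8mb4.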

Edit2:

I tested locally with 10.1.21-MariaDB. UTF-8 test characters:

1-Byte: a

2-Byte: ö

3-Byte: 好

4-Byte: 𠜎

utf8: 21845 @3-Byte (好)
utf8mb4: 16383 @4-Byte (𠜎)
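
The byte widths of the four test characters can be double-checked outside the database. The sketch below assumes Python's "utf-8" codec, which corresponds to MySQL's utf8mb4 (MySQL's 3-byte utf8 cannot store the 4-byte character at all); `utf8_len` and `max_repeats` are names invented for this example.

```python
TEXT_MAX_BYTES = 65535  # byte limit of a TEXT column


def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def max_repeats(ch: str) -> int:
    """How many copies of the character fit into TEXT_MAX_BYTES bytes."""
    return TEXT_MAX_BYTES // utf8_len(ch)


# The test characters, written as escapes: a, ö, 好, 𠜎
for ch in ["a", "\u00f6", "\u597d", "\U0002070e"]:
    print(ch, utf8_len(ch), max_repeats(ch))
```

By this calculation a TEXT column holds 65535 one-byte, 32767 two-byte, 21845 three-byte or 16383 four-byte characters.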

Screenshot of the local test: http://i.imgur.com/5dmRteL.png

Steffen Mächtel answered Mar 25 '23 08:03
