I am writing unit tests for query generators, and I have run into a problem with the LENGTH
function.
I run two queries one after the other. The first is:
SHOW VARIABLES LIKE '%character%'
It returns the following result:
array(8) {
[0] =>
array(2) {
'Variable_name' =>
string(20) "character_set_client"
'Value' =>
string(4) "utf8"
}
[1] =>
array(2) {
'Variable_name' =>
string(24) "character_set_connection"
'Value' =>
string(4) "utf8"
}
[2] =>
array(2) {
'Variable_name' =>
string(22) "character_set_database"
'Value' =>
string(6) "latin1"
}
[3] =>
array(2) {
'Variable_name' =>
string(24) "character_set_filesystem"
'Value' =>
string(6) "binary"
}
[4] =>
array(2) {
'Variable_name' =>
string(21) "character_set_results"
'Value' =>
string(4) "utf8"
}
[5] =>
array(2) {
'Variable_name' =>
string(20) "character_set_server"
'Value' =>
string(4) "utf8"
}
[6] =>
array(2) {
'Variable_name' =>
string(20) "character_set_system"
'Value' =>
string(4) "utf8"
}
[7] =>
array(2) {
'Variable_name' =>
string(18) "character_sets_dir"
'Value' =>
string(26) "/usr/share/mysql/charsets/"
}
}
My second query is:
SELECT LENGTH('重庆') as len
It returns 6 instead of 2.
What's wrong here? My charset settings look correct.
MySQL LENGTH() function: LENGTH() returns the length of a string in bytes.
MySQL supports multiple Unicode character sets, including utf8mb4, a UTF-8 encoding of the Unicode character set using one to four bytes per character, and utf8mb3, a UTF-8 encoding of the Unicode character set using one to three bytes per character.
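To make the one-to-three versus one-to-four byte distinction concrete, here is a minimal Python sketch that measures how many UTF-8 bytes each character needs. The helper names `utf8_byte_widths` and `fits_utf8mb3` are made up for illustration and are not part of MySQL or any library; the sketch only mirrors the byte-width rule described above, not MySQL's actual storage code.

```python
def utf8_byte_widths(text: str) -> list[int]:
    """Return the number of UTF-8 bytes used by each character in text."""
    return [len(ch.encode("utf-8")) for ch in text]


def fits_utf8mb3(text: str) -> bool:
    """Hypothetical check: True if no character needs more than three UTF-8 bytes."""
    return all(width <= 3 for width in utf8_byte_widths(text))


# 'a' (U+0061), 'é' (U+00E9), '重' (U+91CD) and an emoji (U+1F600)
sample = "a\u00e9\u91cd\U0001F600"
print(utf8_byte_widths(sample))      # [1, 2, 3, 4]
print(fits_utf8mb3("重庆"))           # True
print(fits_utf8mb3("\U0001F600"))    # False
```

The Chinese characters in the question need three bytes each, so they fit in either character set; only characters outside the Basic Multilingual Plane, such as the emoji, need the fourth byte.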
The CHAR and VARCHAR types are declared with a length that indicates the maximum number of characters you want to store. For example, CHAR(30) can hold up to 30 characters. The length of a CHAR column is fixed to the length that you declare when you create the table. The length can be any value from 0 to 255.
I found my answer in the MySQL documentation:
The LENGTH function counts bytes:
mysql> SELECT LENGTH('重庆') ;
+------------------+
| LENGTH('重庆') |
+------------------+
| 6 |
+------------------+
1 row in set (0.00 sec)
The CHAR_LENGTH function counts characters:
mysql> SELECT CHAR_LENGTH('重庆') ;
+-----------------------+
| CHAR_LENGTH('重庆') |
+-----------------------+
| 2 |
+-----------------------+
1 row in set (0.00 sec)
The two functions work differently:
LENGTH() always returns the length of the string in bytes, whereas CHAR_LENGTH() returns the length of the string in characters.
With a multi-byte character set the two results differ. In ucs2 every character takes two bytes; in UTF-8 the number of bytes per character varies from one to four. Each of the two Chinese characters here takes three bytes in UTF-8, which is why LENGTH() returns 6.
e.g.:
SELECT LENGTH('重庆'), CHAR_LENGTH('重庆');
--> 6, 2
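The same byte-versus-character distinction can be reproduced outside MySQL. The following Python sketch is only an analogy: `byte_and_char_length` is a hypothetical helper, and it assumes the string is encoded as UTF-8 unless told otherwise, as the connection in the question is.

```python
def byte_and_char_length(text: str, encoding: str = "utf-8") -> tuple[int, int]:
    """Return (bytes, characters), loosely analogous to (LENGTH, CHAR_LENGTH)."""
    byte_length = len(text.encode(encoding))  # size of the encoded string in bytes
    char_length = len(text)                   # number of Unicode code points
    return byte_length, char_length


print(byte_and_char_length("重庆"))               # (6, 2)
print(byte_and_char_length("重庆", "utf-16-le"))  # (4, 2)
print(byte_and_char_length("abc"))               # (3, 3)
```

For plain ASCII text both counts agree, which is why the difference tends to go unnoticed until non-Latin text is involved.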