So, I've had some issues while trying to come over from Latin1
encoded databases, tables as well as columns, and now that everything is finally in UTF-8, I can't seem to update a row in a column. I am trying to replace an "e" with an e with acute (é). But it gives me this:
ERROR 1366 (HY000): Incorrect string value: '\x82m ...' for column 'Name' at row 1
when running this:
UPDATE access SET Name='ém' WHERE id="2";
All databases gives me this when running the status command (except the current database
part of course):
Connection id: 1
Current database: access
Current user: root@localhost
SSL: Not in use
Using delimiter: ;
Server version: 5.1.47-community MySQL Community Server (GPL)
Protocol version: 10
Connection: localhost via TCP/IP
Server characterset: utf8
Db characterset: utf8
Client characterset: utf8
Conn. characterset: utf8
TCP port: 3306
Uptime: 20 min 16 sec
Threads: 1 Questions: 110 Slow queries: 0 Opens: 18 Flush tables: 1 Open tables: 11 Queries per second avg: 0.90
And running the chcp command in cmd gives me 850
. Oh, and at some points I got this:
ERROR 1300 (HY000): Invalid utf8 character string: 'ém' WHERE id="2"
I've looked everywhere for a solution, but I couldn't seem to find anything anywhere, and since I've always had good responses on Stackoverflow, I thought I'd ask here.
Thanks for any help!
The solution is to set the connection variables to whatever codepage your installation of windows uses (not latin1 like what a lot of pages out there recommend - cmd.exe's character encoding isn't latin1).
In my case the codepage is 850:
mysql> SET NAMES cp850;
Here's an example with the connection set to UTF-8:
mysql> show variables like '%char%';
+--------------------------+---------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | C:\xampp\mysql\share\charsets\ |
+--------------------------+---------------------------------+
8 rows in set (0.00 sec)
This is what happens to accented characters:
mysql> select nom from assignatura where nom like '%prob%';
+---------------------------------------+
| nom |
+---------------------------------------+
| Probabilitat i Processos Estocàstics |
| Probabilitat i Processos Estocàstics |
+---------------------------------------+
2 rows in set (0.03 sec)
Notice the extraneous ├
character just before the á
. Also the accent is the wrong direction, it should be à
.
After executing SET NAMES cp850;
:
mysql> show variables like '%char%';
+--------------------------+--------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------+
| character_set_client | cp850 |
| character_set_connection | cp850 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | cp850 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | C:\xampp\mysql\share\charsets\ |
+--------------------------+--------------------------------+
8 rows in set (0.00 sec)
We finally get the correct accented character:
mysql> select nom from assignatura where nom like '%prob%';
+--------------------------------------+
| nom |
+--------------------------------------+
| Probabilitat i Processos Estocàstics |
| Probabilitat i Processos Estocàstics |
+--------------------------------------+
2 rows in set (0.00 sec)
This thread, although somewhat old, seems to result in the conclusion that cmd.exe and the mysql client don't handle UTF-8 encoding properly (with the blame being more aimed at cmd.exe).
Reading in SQL from a file is recommended, as is using an alternative client - or a flavour of UNIX. :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With