I've got a database with a bunch of broken UTF-8 characters scattered across several tables. The list of affected characters isn't very extensive, AFAIK (áéíúóÁÉÍÓÚÑñ).
Fixing a given table is very straightforward:
update orderItem set itemName=replace(itemName,'Ã¡','á');
But I can't find a way to detect the broken characters. If I do something like
SELECT * FROM TABLE WHERE field LIKE "%Ã%";
I get nearly all the rows, because the collation treats Ã as equal to a. All broken characters so far start with an "Ã". The database is in Spanish, so this particular character isn't otherwise used.
The list of broken chars I've got so far is:
Ã¡ = á
Ã© = é
Ã- = í
Ã³ = ó
Ã± = ñ
Ã¡ = Á
Any idea how to make this SELECT work as intended? (A binary search or something like that.)
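To make the pattern in the list above concrete, here is a minimal Python sketch of how such sequences typically arise (UTF-8 bytes misread as Windows-1252) and of an exact, accent-sensitive test for them. The names `mangle` and `looks_broken` are hypothetical helpers invented for this illustration, and the sketch assumes the values have already been fetched into Python strings.

```python
import re

# Accented characters from the question; each is two bytes in UTF-8.
ACCENTED = "áéíóúñ"

def mangle(text: str) -> str:
    """Reproduce the damage: UTF-8 bytes misread as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

# Exact match for "Ã" plus one following character. Python compares
# code points, so "Ã" never matches a plain "a" or "A" here.
BROKEN = re.compile("Ã.")

def looks_broken(value: str) -> bool:
    """Hypothetical helper: True if value contains a suspicious 'Ã' pair."""
    return BROKEN.search(value) is not None

for ch in ACCENTED:
    print(ch, "->", repr(mangle(ch)))

print(looks_broken(mangle("camión")))  # True
print(looks_broken("camion"))          # False
print(looks_broken("camión"))          # False
```

The point of the sketch is only that the comparison must be exact rather than collation-aware. Inside MySQL, a byte-wise comparison (the BINARY operator or a _bin collation) is the usual way to get the same exactness.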
You do that by calling str.valid_encoding? on a String str that is in UTF-8 encoding. Is that not clear from my answer? Programmatically, you cannot (or at least not easily, and certainly not reliably) check the validity of a string in a one-byte encoding such as CP1252.
0xC0, 0xC1, 0xF5, 0xF6, 0xF7, 0xF8, 0xF9, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF are invalid UTF-8 code units. A UTF-8 code unit is 8 bits. If by char you mean an 8-bit byte, then the invalid UTF-8 code units would be char values that do not appear in UTF-8 encoded text.
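As a rough illustration of the byte list above, the following Python sketch flags those never-valid bytes and also runs a strict decode, which additionally catches bad sequences made of otherwise legal bytes. The function names are hypothetical and chosen only for this example.

```python
# Bytes that can never occur anywhere in well-formed UTF-8.
NEVER_VALID = frozenset([0xC0, 0xC1, *range(0xF5, 0x100)])

def forbidden_bytes(data: bytes) -> list:
    """Return the offsets of bytes that are invalid in any UTF-8 stream."""
    return [i for i, b in enumerate(data) if b in NEVER_VALID]

def is_valid_utf8(data: bytes) -> bool:
    """Strict decode: also rejects truncated or misplaced sequences."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(forbidden_bytes(b"a\xc0b\xff"))          # [1, 3]
print(is_valid_utf8("ñ".encode("utf-8")))      # True
print(is_valid_utf8(b"\xc3"))                  # False (truncated sequence)
print(is_valid_utf8("Ã¡".encode("utf-8")))     # True
```

Note the last line: doubly encoded text such as "Ã¡" is itself perfectly valid UTF-8, so a validity check alone will not find it.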
I fixed it with:
UPDATE wp_zcs9ck_posts_copy SET post_title = CONVERT(BINARY CONVERT(post_title USING latin1) USING utf8);
Complete solution: http://jonisalonen.com/2012/fixing-doubly-utf-8-encoded-text-in-mysql/
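The nested CONVERT above turns the text back into latin1 bytes and then reads those bytes as UTF-8. The same round trip can be sketched in Python, assuming the damage came from cp1252 (which is what MySQL calls latin1). The function name `undo_double_encoding` is hypothetical, and returning the input unchanged on failure is a choice made for this sketch, not a description of what the SQL does.

```python
def undo_double_encoding(text: str) -> str:
    """Re-encode as cp1252, then decode the resulting bytes as UTF-8.

    Returns the input unchanged if either step fails, which usually
    means the value was not double-encoded in the first place.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text

print(undo_double_encoding("camiÃ³n"))  # camión
print(undo_double_encoding("camión"))   # camión (already correct, left alone)
print(undo_double_encoding("camion"))   # camion
```

Unlike a list of REPLACE calls, this handles every double-encoded character at once, but it is worth trying on a copy of the table first, as the statement above does with a `_copy` table.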
-- Accented letters (UTF-8 bytes that were read as latin1)
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'Ã¡', 'á');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'Ã¤', 'ä');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'Ã©', 'é');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'í©', 'é');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'Ã³', 'ó');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'íº', 'ú');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'Ãº', 'ú');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'Ã±', 'ñ');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'í‘', 'Ñ');
-- Bare 'Ã' must come after every longer 'Ã…' pattern
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'Ã', 'í');
-- Punctuation
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'â€“', '–');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'â€™', '\'');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'â€¦', '...');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'â€“', '-');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'â€œ', '"');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'â€˜', '\'');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'â€¢', '-');
-- Bare 'â€' must come after every longer 'â€…' pattern
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'â€', '"');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, '‡', 'c');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`, 'Â', '');
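Typing such a list by hand is error-prone, so as a sketch the broken-to-correct table can be derived from the characters themselves. This assumes the damage is UTF-8 read as cp1252. `CHARS`, `build_table` and `repair` are hypothetical names for this example, and unlike the statements above it restores the original punctuation rather than substituting ASCII.

```python
# Characters expected in the data (an assumption; extend as needed).
CHARS = "áäéíóúñÑ–’…“‘•"

def build_table(chars: str) -> dict:
    """Map each character's mangled form to the character itself."""
    return {c.encode("utf-8").decode("cp1252"): c for c in chars}

def repair(text: str, table: dict) -> str:
    """Apply every replacement, longest broken sequence first."""
    for broken in sorted(table, key=len, reverse=True):
        text = text.replace(broken, table[broken])
    return text

table = build_table(CHARS)
print(repair("EspaÃ±a â€“ camiÃ³n", table))  # España – camión
```

Replacing the longest sequences first is a safeguard against a short pattern consuming part of a longer one, the same ordering concern that applies to a list of REPLACE statements.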