Somewhere along the way, between all the imports and exports I have done, a lot of the text on a blog I run is full of weird accented A characters.
When I export the data using mysqldump and load it into a text editor with the intention of using search-and-replace to clear out the bad characters, searching just matches every "a" character.
Does anyone know any way I can successfully hunt down these characters and get rid of them, either directly in MySQL or by using mysqldump and then reimporting the content?
You can remove special characters from a database field using REPLACE() function.
If you want to detect hidden or unwanted characters as part of an initial diagnosis, use LENGTH . Then, use TRIM to get rid of unwanted characters. We also discussed how you can nest SQL functions, which is a powerful technique in programming.
UPDATE tablename SET columnToCheck = REPLACE(CONVERT(columnToCheck USING ascii), '? ', '') WHERE ... If the database is encoded UTF8 (mysql default), the first code will delete all characters with accents, which could be undesirable.
Backslash ( \ ) and the quote character used to quote the string must be escaped. In certain client environments, it may also be necessary to escape NUL or Control+Z.
This is an encoding problem; the Â
is a non-breaking space (HTML entity
) in Unicode being displayed in Latin1.
You might try something like this... first we check to make sure the matching is working:
SELECT * FROM some_table WHERE some_field LIKE BINARY '%Â%'
This should return any rows in some_table
where some_field
has a bad character. Assuming that works properly and you find the rows you're looking for, try this:
UPDATE some_table SET some_field = REPLACE( some_field, BINARY 'Â', '' )
And that should remove those characters (based on the page you linked, you don't really want an nbsp there as you would end up with three spaces in a row between sentences etc, you should only have one).
If it doesn't work then you'll need to look at the encoding and collation being used.
EDIT: Just added BINARY
to the strings; this should hopefully make it work regardless of encoding.
The accepted answer did not work for me.
From here http://nicj.net/mysql-converting-an-incorrect-latin1-column-to-utf8/ I have found that the binary code for  character is c2a0 (by converting the column to VARBINARY and looking what it turns to). Then here http://www.oneminuteinfo.com/2013/11/mysql-replace-non-ascii-characters.html found the actual solution to remove (replace) it:
update entry set english_translation = unhex(replace(hex(english_translation),'C2A0','20')) where entry_id = 4008;
The query above replaces it to a space, then a normal trim can be applied or simply replace to '' instead.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With