For some reason while using PhpMyAdmin returns 90 rows when running:
SELECT COUNT(*)
FROM le_wp_posts
WHERE post_content LIKE '%Â%'
But the following updates 3 rows only:
UPDATE le_wp_posts
SET post_content = REPLACE(post_content, 'Â', '')
WHERE post_content LIKE '%Â%'
I have also tried it omitting the WHERE
clause in the UPDATE
statement. Is there any obvious reason I'm overlooking that's causing this issue? Or what steps can I take further to investigate the cause? My SQL is not the best.
I did the following test...
1) Create a table with some data:
create table test(col varchar(10));
insert into test values ('abc'), ('dbe');
2) Select number of rows using your same filter (but different character):
select count(*)
from test
where col like '%B%' -- note the uppercase
;
Got the following result:
+----------+ | count(*) | +----------+ | 2 | +----------+ 1 row in set
3) Tried your update:
update test
set col = replace(col, 'B', '') -- note the uppercase
where col like '%B%' -- note the uppercase
;
And got this result:
Query OK, 0 rows affected (0.01 sec) Rows matched: 2 Changed: 0 Warnings: 0
In my case, a default character set and collation where used on table creation. The default character set was 'latin1' and collation 'latin1_swedish_ci'. Note the ci
at the end of the collation.... it means case insensitive. So, the LIKE
filter did a case insensitive search, found 2 rows, but the REPLACE
function, as can be seen on documentation, is case sensitive. Probably, as in my case, the update found the same number of rows as in the select, but updated less data because of the case restriction on REPLACE
.
If this is your problem, can't you just run two updates, one for the uppercase case and one for the lowercase? I'll try to develop a solution on one update...
The docs about the REPLACE(str, from_str, to_str)
function:
Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str.
The docs about the LIKE
operator:
The following two statements illustrate that string comparisons are not case sensitive unless one of the operands is a case sensitive (uses a case-sensitive collation or is a binary string):
The first example:
mysql> SELECT 'abc' LIKE 'ABC'; -> 1
The second example:
mysql> SELECT 'abc' LIKE _latin1 'ABC' COLLATE latin1_general_cs; -> 0
Note the cs
at the end of the collation. It means case sensitive.
If you take a utf8-encoded £
(C2A3
, treated as utf8) and store it into a latin1 column, when you read it back, you get £
(C2A3
, treated as latin1). Removing the Â
will work for about 32 characters, but will fail for many other characters. And it will make the table harder to repair!
Let's look at an example of what you tried to store, together with the HEX
of that ended up in the table. Also, let's look at SHOW CREATE TABLE
to confirm my suspicion that the target it latin1
.
This discusses the HEX
debugging technique. And it discusses "Best Practice", which includes declaring, during the connection, that you really have utf8, not latin1. And it talks about "Mojibake", with an example of where ñ
turns into ñ
, making REPLACE
a messy prospect.
Your symptom with LIKE
is consistent with character set mismatches.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With