Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL REPLACE affects 0 rows but WHERE ... LIKE returns 90

For some reason while using PhpMyAdmin returns 90 rows when running:

SELECT COUNT(*) 
FROM le_wp_posts 
WHERE post_content LIKE '%Â%'

But the following updates 3 rows only:

UPDATE le_wp_posts 
SET post_content = REPLACE(post_content, 'Â', '') 
WHERE post_content LIKE '%Â%'

I have also tried it omitting the WHERE clause in the UPDATE statement. Is there any obvious reason I'm overlooking that's causing this issue? Or what steps can I take further to investigate the cause? My SQL is not the best.

like image 405
rickyduck Avatar asked Sep 19 '16 22:09

rickyduck


2 Answers

I did the following test...

1) Create a table with some data:

create table test(col varchar(10));

insert into test values ('abc'), ('dbe');

2) Select number of rows using your same filter (but different character):

select count(*)
from test
where col like '%B%' -- note the uppercase
;

Got the following result:

+----------+                                                                                                                                                                
| count(*) |                                                                                                                                                                
+----------+                                                                                                                                                                
|        2 |                                                                                                                                                                
+----------+ 

1 row in set

3) Tried your update:

update test 
set col = replace(col, 'B', '') -- note the uppercase
where col like '%B%' -- note the uppercase
;

And got this result:

Query OK, 0 rows affected (0.01 sec)                                                                                                                                        
Rows matched: 2  Changed: 0  Warnings: 0

In my case, a default character set and collation where used on table creation. The default character set was 'latin1' and collation 'latin1_swedish_ci'. Note the ci at the end of the collation.... it means case insensitive. So, the LIKE filter did a case insensitive search, found 2 rows, but the REPLACE function, as can be seen on documentation, is case sensitive. Probably, as in my case, the update found the same number of rows as in the select, but updated less data because of the case restriction on REPLACE.

If this is your problem, can't you just run two updates, one for the uppercase case and one for the lowercase? I'll try to develop a solution on one update...

The docs about the REPLACE(str, from_str, to_str) function:

Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str.

The docs about the LIKE operator:

The following two statements illustrate that string comparisons are not case sensitive unless one of the operands is a case sensitive (uses a case-sensitive collation or is a binary string):

The first example:

mysql> SELECT 'abc' LIKE 'ABC';
        -> 1

The second example:

mysql> SELECT 'abc' LIKE _latin1 'ABC' COLLATE latin1_general_cs;
        -> 0

Note the cs at the end of the collation. It means case sensitive.

like image 186
Felypp Oliveira Avatar answered Sep 28 '22 18:09

Felypp Oliveira


If you take a utf8-encoded £ (C2A3, treated as utf8) and store it into a latin1 column, when you read it back, you get £ (C2A3, treated as latin1). Removing the  will work for about 32 characters, but will fail for many other characters. And it will make the table harder to repair!

Let's look at an example of what you tried to store, together with the HEX of that ended up in the table. Also, let's look at SHOW CREATE TABLE to confirm my suspicion that the target it latin1.

This discusses the HEX debugging technique. And it discusses "Best Practice", which includes declaring, during the connection, that you really have utf8, not latin1. And it talks about "Mojibake", with an example of where ñ turns into ñ, making REPLACE a messy prospect.

Your symptom with LIKE is consistent with character set mismatches.

like image 45
Rick James Avatar answered Sep 28 '22 18:09

Rick James