How to make MySQL work "case insensitive" and "accent insensitive" in UTF-8

I have a schema with "utf8 -- UTF-8 Unicode" as its charset and "utf8_spanish_ci" as its collation.

All the tables in it are InnoDB, with the same charset and collation.

Here is the problem:

With a query like

SELECT *
FROM people p
WHERE p.NAME LIKE '%jose%';

I get 83 rows. I should get 84, because I know that many matching rows exist.

Changing the WHERE clause to:

WHERE p.NAME LIKE '%JOSE%';

I get exactly the same 83 rows. Combinations like JoSe, Jose, JOSe, etc. all return the same 83 rows.

The problem appears when accents come into play. If I do:

WHERE p.NAME LIKE '%josé%';

I get no results. 0 rows.

But if I do:

WHERE p.NAME LIKE '%JOSÉ%';

I get just one row. This is the only row that contains an accented, capitalized "JOSÉ".

I've tried josÉ, JoSÉ, and every other capitalization. As long as the accented letter stays capitalized, as it is actually stored in the database, the query still returns that one row. As soon as I change "É" to "é", whatever the capitalization of the rest of JOSE, it returns no rows.

So, my conclusions:

Case insensitive as long as no accented ("latin") characters are involved.
Case sensitive when accented characters appear.
Accent sensitive: searching for JOSE or jose returns only 83 rows instead of the 84 I need.

What do I want?

Searching for "jose", "JOSE", "José", "JOSÉ", "JÒSE", "jöse", "JoSÈ", ... should return the 84 rows I know exist. I want my searches to be case insensitive and "latin" (accent) insensitive.

Solutions like adding COLLATE to the LIKE don't work for me, and I don't know why...

What can I do?

EDIT:

If I do something like:

WHERE p.NAME LIKE '%jose%' COLLATE utf8_general_ci;

I get the error:

COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'

And I've changed every collation I could on the columns too!

And if I do something like:

WHERE p.NAME LIKE _utf8 '%jose%' COLLATE utf8_general_ci;

The same 83 rows are returned, as if I had changed nothing...

asked May 31 '12 09:05

Lightworker

1 Answer

You have already tried to use an accent-insensitive collation for your search and ordering.

http://dev.mysql.com/doc/refman/5.0/en/charset-collation-implementations.html

The thing is, your NAME column seems to be stored in the latin1 (8-bit) character set. That's why MySQL is grumbling at you like this:

  COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'
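
Before changing anything, it may help to confirm which character set is actually in play. The statements below are a minimal sketch using the table and column names from the question; they only read metadata. One assumption worth checking: in the failing query, COLLATE binds to the string literal rather than to the column, so a latin1 connection character set could also trigger this error even if the column itself is utf8.

```sql
-- Character set and collation declared on the column itself
SHOW FULL COLUMNS FROM people LIKE 'NAME';

-- What MySQL reports for the column versus a plain string literal
SELECT CHARSET(p.NAME), COLLATION(p.NAME),
       CHARSET('jose'), COLLATION('jose')
FROM people p
LIMIT 1;

-- Character sets and collations used by the client, connection, and server
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
```

If the literal reports latin1, issuing SET NAMES utf8 on the connection before running the search is one thing to try.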

You may get the results you want if you try

 WHERE CONVERT(p.NAME USING utf8) LIKE _utf8 '%jose%' COLLATE utf8_general_ci;

But, be careful!

When you use any kind of function (in this example, CONVERT) on the column in a WHERE clause, you defeat MySQL's attempts to optimize your search with indexes. If this project is going to get large (that is, if you will have lots of rows in your tables), you need to store your data in utf8, not latin1. (You probably already know that a LIKE '%whatever%' search term, with its leading wildcard, also defeats MySQL's indexing.)
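
Storing the data in utf8 could look roughly like the sketch below. It assumes the column really holds latin1-encoded text; if utf8 bytes were previously written into a latin1 column, a straight conversion would garble them, so take a backup and test on a copy first. The VARCHAR(255) definition is a placeholder, not the real column definition.

```sql
-- Convert every text column in the table to utf8 (rewrites the table).
ALTER TABLE people CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Or convert only the one column. VARCHAR(255) is an assumed definition;
-- substitute the column's real type and length.
ALTER TABLE people
  MODIFY NAME VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Make sure the connection sends utf8 as well.
SET NAMES utf8;

-- With column and connection both in utf8, no CONVERT or COLLATE is needed.
SELECT *
FROM people p
WHERE p.NAME LIKE '%jose%';
```

Only one of the two ALTER TABLE statements is needed. The leading-wildcard LIKE still cannot use an ordinary index, so this removes the CONVERT cost but not the full scan.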

answered Oct 23 '22 20:10

O. Jones
