Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL treats ÅÄÖ as AAO?

These two querys gives me the exact same result:

select * from topics where name='Harligt'; select * from topics where name='Härligt'; 

How is this possible? Seems like mysql translates åäö to aao when it searches. Is there some way to turn this off?

I use utf-8 encoding everywhere as far as i know. The same problem occurs both from terminal and from php.

like image 829
Martin Avatar asked Apr 09 '10 11:04

Martin


2 Answers

Yes, this is standard behaviour in the non-language-specific unicode collations.

9.1.13.1. Unicode Character Sets

To further illustrate, the following equalities hold in both utf8_general_ci and utf8_unicode_ci (for the effect this has in comparisons or when doing searches, see Section 9.1.7.7, “Examples of the Effect of Collation”):

Ä = A Ö = O Ü = U

See also Examples of the effect of collation

You need to either

  • use a collation that doesn't have this "feature" (namely utf8_bin, but that has other consequences)

  • use a different collation for the query only. This should work:

     select * from topics where name='Harligt' COLLATE utf8_bin; 

it becomes more difficult if you want to do a case insensitive LIKE but not have the Ä = A umlaut conversion. I know no mySQL collation that is case insensitive and does not do this kind of implicit umlaut conversion. If anybody knows one, I'd be interested to hear about it.

Related:

  • Looking for case insensitive MySQL collation where “a” != “ä”
  • MYSQL case sensitive search for utf8_bin field
like image 197
Pekka Avatar answered Sep 21 '22 19:09

Pekka


Since you are in Sweden I'd recommend using the Swedish collation. Here's an example showing the difference it makes:

CREATE TABLE topics (name varchar(100) not null) CHARACTER SET utf8;  INSERT topics (name) VALUES ('Härligt');  select * from topics where name='Harligt'; 'Härligt'  select * from topics where name='Härligt'; 'Härligt'      ALTER TABLE topics MODIFY name VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_swedish_ci;  select * from topics where name='Harligt'; <no results>  select * from topics where name='Härligt'; 'Härligt' 

Note that in this example I only changed the one column to Swedish collation, but you should probably do it for your entire database, all tables, all varchar columns.

like image 44
Mark Byers Avatar answered Sep 24 '22 19:09

Mark Byers