Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PHP PDO and COLLATE

Tags:

php

mysql

utf-8

pdo

I fail to select some entries in a mysql database, comparing french strings with accentuated characters.

Datas samples below are oversimplified for more readability. And I try to be precise, so it's a bit long (sorry).

Context

In database, I have some strings like "année", "annee", "début", "debut", etc. The table and columns are charset utf8 with collate utf8_general_ci.

I'm using MySQL 5.5.30 and PHP 5.4.13 with PDO initializing with charset utf8 :

$this->dbh = new PDO('mysql:host=' . $this->host . ';dbname=' . $this->base . ';charset=utf8', $this->user, $this->pass);

In mysql console client

If I SELECT without any collate precision, like

AND data = :data

with :data = "année". I will get lines with "année" AND "annee".

Then I SELECT with utf8_bin collate precision, like

AND data = :data COLLATE utf8_bin

with :data = "année", and I retrieve only lines with "année". COOL.

With PHP PDO

If I use same SQL in PHP script, with collate precision, I get this error message :

 COLLATION 'utf8_bin' is not valid for CHARACTER SET 'binary'

If I use binary collation, like

AND data = :data COLLATE `binary`

I get the error message

 COLLATION 'binary' is not valid for CHARACTER SET 'utf8'

So, my questions are

In PHP script

  • "How to set explicitely the charset ?" (I believed I did that in PDO string initialization, but it seems to be wrong. "SET NAMES ..." does not work any better in this case)
  • "How can I retrieve correct results with PDO when comparing with accentuated characters?"

Precisions

Here are more details (ask in comments).

*Extract from the "show create table"

CREATE TABLE `Request` (
`id` int(8) NOT NULL AUTO_INCREMENT,
`client_id` int(8) DEFAULT NULL,
....
`domain_version_corrective` varchar(20) NOT NULL,
PRIMARY KEY (`id`),
) ENGINE=MyISAM DEFAULT CHARSET=utf8

*Two sample SELECT (from console, not with PHP / PDO):

SELECT domain_id, domain_version_corrective FROM Request WHERE client_id = 3661 AND domain_version_corrective = 'vèrçion 2.0' ;

which returns

+-----------+---------------------------+
| domain_id | domain_version_corrective |
+-----------+---------------------------+
| FOOBAR    | vercion 2.0               |
| FOOBAR    | vèrcion 2.0               |
| FOOBAR    | verçion 2.0               |
| FOOBAR    | vèrçion 2.0               |
| FOOBAR    | vèrcion 2.0               |
+-----------+---------------------------+

And

SELECT domain_id, domain_version_corrective FROM Request WHERE client_id = 3661 AND domain_version_corrective = 'vèrçion 2.0' COLLATE utf8_bin;

which returns

+-----------+---------------------------+
| domain_id | domain_version_corrective |
+-----------+---------------------------+
| FOOBAR    | vèrçion 2.0               |
+-----------+---------------------------+

*The charsets :

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
like image 528
Armage Avatar asked Jun 12 '26 12:06

Armage


1 Answers

I had the same problem and I succeeded by wrapping the parameter with convert(), like so:

data = convert(:data using utf8) collate utf8_bin

It appears that PDO always tags string parameters with character set BINARY by PDO. It would be really nice to find out how to change this.

Also, I'm not sure if it's actually correct to convert the string to UTF8 like this, I suppose technically it could depend on how PHP represents the string internally.

like image 124
joki Avatar answered Jun 14 '26 02:06

joki



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!