Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PHP/MySQL with encoding problems

I am having trouble with PHP regarding encoding.

I have a JavaScript/jQuery HTML5 page interact with my PHP script using $.post. However, PHP is facing a weird problem, probably related to encoding.

When I write

htmlentities("í")

I expect PHP to output í. However, instead it outputs í At the beginning, I thought that I was making some mistake with the encodings, however

htmlentities("í")=="í"?"Good":"Fail";

is outputing "Fail", where

htmlentities("í")=="í"?"Good":"Fail";

But htmlentities($search, null, "utf-8") works as expected.

I want to have PHP communicate with a MySQL server, but it has encoding problems too, even if I use utf8_encode. What should I do?

EDIT: On the SQL command, writing

SELECT id,uid,type,value FROM users,profile
WHERE uid=id AND type='name' AND value='XXX';

where XXX contains no í chars, works as expected, but it does not if there is any 'í' char.

SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SELECT id,uid,type,value FROM users,profile
WHERE uid=id AND type='name' AND value='XXX';

Not only fails for í chars, but it ALSO fails for strings without any 'special' characters. Removing the ' chars from SET NAMES and SET CHARACTER SET doesn't seem to change anything.

I am connecting to the MySQL database using PDO.

EDIT 2: I am using MySQL version 5.1.30 of XAMPP for Linux.

EDIT 3: Running SHOW VARIABLES LIKE '%character%' from PhpMyAdmin outputs

character_set_client    utf8
character_set_connection    utf8
character_set_database  latin1
character_set_filesystem    binary
character_set_results   utf8
character_set_server    latin1
character_set_system    utf8
character_sets_dir  /opt/lampp/share/mysql/charsets/

Running the same query from my PHP script(with print_r) outputs:

Array
(
    [0] => Array
        (
            [Variable_name] => character_set_client
            [0] => character_set_client
            [Value] => latin1
            [1] => latin1
        )

    [1] => Array
        (
            [Variable_name] => character_set_connection
            [0] => character_set_connection
            [Value] => latin1
            [1] => latin1
        )

    [2] => Array
        (
            [Variable_name] => character_set_database
            [0] => character_set_database
            [Value] => latin1
            [1] => latin1
        )

    [3] => Array
        (
            [Variable_name] => character_set_filesystem
            [0] => character_set_filesystem
            [Value] => binary
            [1] => binary
        )

    [4] => Array
        (
            [Variable_name] => character_set_results
            [0] => character_set_results
            [Value] => latin1
            [1] => latin1
        )

    [5] => Array
        (
            [Variable_name] => character_set_server
            [0] => character_set_server
            [Value] => latin1
            [1] => latin1
        )

    [6] => Array
        (
            [Variable_name] => character_set_system
            [0] => character_set_system
            [Value] => utf8
            [1] => utf8
        )

    [7] => Array
        (
            [Variable_name] => character_sets_dir
            [0] => character_sets_dir
            [Value] => /opt/lampp/share/mysql/charsets/
            [1] => /opt/lampp/share/mysql/charsets/
        )

)

Running

SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SHOW VARIABLES LIKE '%character%'

outputs an empty array.

like image 338
luiscubal Avatar asked Jan 01 '09 23:01

luiscubal


People also ask

Does MySQL support UTF-8?

MySQL supports multiple Unicode character sets: utf8mb4 : A UTF-8 encoding of the Unicode character set using one to four bytes per character.

Does PHP support UTF-8?

PHP | utf8_encode() FunctionThe utf8_encode() function is an inbuilt function in PHP which is used to encode an ISO-8859-1 string to UTF-8. Unicode has been developed to describe all possible characters of all languages and includes a lot of symbols with one unique number for each symbol/character.


1 Answers

It's very important to specify the encoding of htmlentities to match that of the input, as you did in your final example but omitted in the first three.

htmlentities($text,ENT_COMPAT,'utf-8');

Regarding communications with MySQL, you need to make sure the connection collation and character set matches the data you are transmitting. You can either set this in the configuration file, or at runtime using the following queries:

SET NAMES utf8;
SET CHARACTER SET utf8;

Make sure the table, database and server character sets match as well. There is one setting you can't change at run-time, and that's the server's character set. You need to modify it in the configuration file:

[mysqld]
character-set-server = utf8
default-character-set = utf8 
skip-character-set-client-handshake

Read more on characters sets and collations in MySQL in the manual.

like image 164
Eran Galperin Avatar answered Oct 25 '22 11:10

Eran Galperin