
Resolving incorrect character encoding when displaying MySQL database results after upgrade to PHP 5.3

Issue Description

After upgrading PHP on our development server from 5.2 to 5.3, Russian text fetched from our database is displayed on web pages with the wrong encoding.

Environment

  • Dev OS: Debian GNU/Linux 6.0
  • Dev PHP: 5.3.5-0.dotdeb.1
  • Live MySQL: Distrib 5.1.49

Details

PHP 5.3 introduced mysqlnd as an alternative to libmysql as the client library for talking to MySQL (it became the build default in PHP 5.4). Our PHP 5.3 build appears to use mysqlnd, and we suspect that change is the cause of the issue we are encountering.

We are connecting to the database with the following code:

$conn = mysql_pconnect('database.hostname', 'database_user', 'database_password');
mysql_select_db('database', $conn);

The data stored in our database is encoded as UTF-8. Connecting to the database via the command-line client and running queries confirms that the data is intact and encoded properly. However, when we query the database in PHP and try to display the exact same data, it comes out garbled. In this specific case, we're attempting to display Russian characters, and the result is a string of characters that are neither English nor Russian:

[Image: garbled output]

The response headers we receive confirm that the content-type is UTF-8:

[Image: response headers]

Before displaying the strings, we tested them with mb_detect_encoding in strict mode and with mb_check_encoding; both reported UTF-8. We also checked the client encoding with mysql_client_encoding, which likewise reports that the character set is UTF-8.
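
One caveat about these checks, offered as a sketch rather than a diagnosis: a string that has been encoded to UTF-8 twice is still valid UTF-8, so mb_check_encoding and mb_detect_encoding cannot tell it apart from correct text. The example below assumes the mbstring extension, a source file saved as UTF-8, and a made-up sample word. The Latin-1 misreading it simulates is a hypothesis, not something the question confirms.

```php
<?php
// Hypothetical sample text; this file must itself be saved as UTF-8.
$original = "Привет";

// Assumption: simulate a layer that misreads the UTF-8 bytes as ISO-8859-1
// and converts them to UTF-8 a second time.
$garbled = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Both strings pass the validity checks, although only one is readable.
$originalIsValid = mb_check_encoding($original, 'UTF-8');
$garbledIsValid  = mb_check_encoding($garbled, 'UTF-8');
$detected        = mb_detect_encoding($garbled, 'UTF-8', true);

var_dump($originalIsValid, $garbledIsValid, $detected);

// The two strings are nevertheless different byte sequences.
var_dump($garbled === $original);
```

If something like this is happening, a passing encoding check says only that the bytes are well-formed, not that they are the bytes that were stored.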

While researching the problem, we found several suggested workarounds:

header("Content-type: text/html; charset=utf-8");
mysql_set_charset('utf8');
mysql_query("SET SESSION character_set_results = 'UTF8'");
mysql_query('SET NAMES UTF8', $conn);

We even tried utf8_encode:

utf8_encode($string);

However, none of these solutions worked.
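
A likely reason utf8_encode could not help: it treats its input as ISO-8859-1, so applying it to text that is already UTF-8 adds a second layer of encoding. The sketch below uses mb_convert_encoding with the same source and target encodings as a stand-in for utf8_encode (an assumption, made so that it also runs on newer PHP versions where utf8_encode is deprecated). The single character is only an illustration.

```php
<?php
// "П" (U+041F) written as its two UTF-8 bytes, so the file encoding does not matter.
$utf8 = "\xD0\x9F";

// Stand-in for utf8_encode($utf8): convert from ISO-8859-1 to UTF-8.
$encodedAgain = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// Converting back the other way removes the extra layer again.
$restored = mb_convert_encoding($encodedAgain, 'ISO-8859-1', 'UTF-8');

echo bin2hex($utf8), "\n";         // d09f
echo bin2hex($encodedAgain), "\n"; // c390c29f
echo bin2hex($restored), "\n";     // d09f
```

Each original byte becomes its own two-byte character, which is why the output grows and shows unrelated accented characters instead of Cyrillic.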

Running out of options, we upgraded MySQL on our development system to Distrib 5.1.55. After that upgrade, everything displayed correctly when we connected to our development database. Of course, it continues to display incorrectly when we connect to our live database.

Ideally, we would like to resolve this issue without upgrading MySQL on our production servers unless we can verify the exact reason why this isn't working and why the upgrade will fix it. How can we resolve this encoding issue without upgrading MySQL? Alternatively, why does the MySQL upgrade fix the issue?

Shaun asked Mar 03 '11 23:03


2 Answers

I see you've tried this, but the syntax I use is: mysql_query("SET NAMES utf8"). Your syntax may be correct; I've just never seen it written that way before.

Example:

// Connect to the database server
$Connection = mysql_connect($server, $username, $password)
    or die("Error connecting to server");

// Select the database
$db = mysql_select_db($database, $Connection)
    or die("Error selecting database");

// Set the connection character set to UTF-8
mysql_query("SET NAMES utf8");
Jarrod answered Oct 18 '22 17:10


If you have made sure that both the tables and the output encoding are UTF-8, almost the only thing left is the connection encoding.

The reason for the change in behaviour when updating servers could be a change of the default connection encoding:

[mysql]
default-character-set=utf8

However, I can't see any changes in the default encoding between versions, so if those were brand-new installs, I can't see that happening.

Anyway, what happens if you run the following queries from within PHP and output the results? Are there any differences from the command-line output?

SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
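
To make the comparison easier, the two result sets could be diffed in PHP. This is only a sketch: diffVariables is a hypothetical helper, and the sample values are made up for illustration, not taken from the question. In practice the arrays would be filled from the actual query results, as hinted in the comment.

```php
<?php
/**
 * Hypothetical helper: return the variables whose values differ between the
 * result seen from PHP and the result seen from the command-line client.
 */
function diffVariables(array $fromPhp, array $fromCli)
{
    $diff = array();
    foreach ($fromPhp as $name => $phpValue) {
        $cliValue = isset($fromCli[$name]) ? $fromCli[$name] : null;
        if ($cliValue !== $phpValue) {
            $diff[$name] = array('php' => $phpValue, 'cli' => $cliValue);
        }
    }
    return $diff;
}

// With the mysql extension from the question, $fromPhp could be filled like this:
// $result = mysql_query("SHOW VARIABLES LIKE 'character_set%'", $conn);
// while ($row = mysql_fetch_assoc($result)) {
//     $fromPhp[$row['Variable_name']] = $row['Value'];
// }

// Made-up sample values, for illustration only.
$fromPhp = array(
    'character_set_client'     => 'latin1',
    'character_set_connection' => 'latin1',
    'character_set_results'    => 'latin1',
    'character_set_database'   => 'utf8',
);
$fromCli = array(
    'character_set_client'     => 'utf8',
    'character_set_connection' => 'utf8',
    'character_set_results'    => 'utf8',
    'character_set_database'   => 'utf8',
);

foreach (diffVariables($fromPhp, $fromCli) as $name => $values) {
    printf("%-26s php=%s cli=%s\n", $name, $values['php'], $values['cli']);
}
```

Any variable that shows up in the diff is a candidate for the setting that differs between the PHP connection and the command-line session.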
Pekka answered Oct 18 '22 17:10