
UTF8 issues PHP -> MySQL. Getting question marks in database?

Tags: php, mysql, utf-8

OK, I am currently in PHP/MySQL/UTF-8/Unicode hell!

My environment:

MySQL: 5.1.53
Server characterset: latin1
Db characterset: latin1
Client characterset: latin1
Conn. characterset: latin1

PHP: 5.3.3

My PHP files are saved as UTF-8, not ASCII.

In my PHP code when I make the database connection I do the following:

ini_set('default_charset', 'utf-8');
$my_db = mysql_connect(DEV_DB, DEV_USER, DEV_PASS);
mysql_select_db(MY_DB);

// I have tried both of the following utf8 connection functions
// mysql_query("SET NAMES 'utf8'", $my_db);
mysql_set_charset('utf8', $my_db);

// Detect if form value is not UTF-8
if (mb_detect_encoding($_POST['lang_desc']) == 'UTF-8') {
    $lang_description = $_POST['lang_desc'];
} else {
    $lang_description = utf8_encode($_POST['lang_desc']);
}

$language_sql = sprintf(
    'INSERT INTO app_languages (language_id, app_id, description) VALUES (%d, %d, "%s")',
    intval($lang_data['lang_id']),
    intval($new_app_id),
    mysql_real_escape_string($lang_description, $my_db)
);
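
Editor's note on the encoding check in the code above: mb_detect_encoding() guesses an encoding, whereas mb_check_encoding() validates the bytes against a named one. The following is only a sketch of that alternative, not the asker's code. The helper name to_utf8 is hypothetical, and it assumes PHP 7 or later with mbstring, a source file saved as UTF-8, and that any input that is not valid UTF-8 is ISO-8859-1 (latin1).

```php
<?php
// Hypothetical helper (not from the original post): return $value as UTF-8.
// Assumption: input that is not valid UTF-8 is treated as ISO-8859-1 (latin1).
function to_utf8(string $value): string
{
    // mb_check_encoding() validates the byte sequence instead of guessing.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    // Re-encode latin1 bytes as UTF-8.
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$fromLatin1  = to_utf8("\xE9");   // a lone latin1 "é" byte is converted
$alreadyUtf8 = to_utf8('阿拉伯');  // valid UTF-8 passes through unchanged
```

In the posted code this would stand in for the if/else around $_POST['lang_desc']. It does not address the connection character set, which is a separate matter.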

The CREATE statement for my MySQL table is:

CREATE TABLE IF NOT EXISTS app_languages (
    language_id int(10) unsigned NOT NULL,
    app_id int(10) unsigned NOT NULL,
    description tinytext collate utf8_unicode_ci,
    PRIMARY KEY (language_id,app_id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

The SQL statements that are generated from my PHP code look like this:

INSERT INTO app_languages (language_id, app_id, description) VALUES (91, 2055, "阿拉伯体育新闻和信息")
INSERT INTO app_languages (language_id, app_id, description) VALUES (26, 2055, "阿拉伯體育新聞和信息")
INSERT INTO app_languages (language_id, app_id, description) VALUES (56, 2055, "בערבית ספורט חדשות ומידע")
INSERT INTO app_languages (language_id, app_id, description) VALUES (69, 2055, "アラビア語のスポーツニュースと情報")

Yet, the output appears in my database as this:

|          69 |   2055 | ?????????????????                               |
|          56 |   2055 | ?????? ????? ????? ?????                        |
|          28 |   2055 | Arapski sportske vijesti i informacije          |
|          42 |   2055 | Arabe des nouvelles sportives et d\'information |
|          91 |   2055 | ??????????                                      |

What am I doing wrong??

P.S. We can use PuTTY to SSH directly to the database server and paste one of the Unicode/multilingual INSERT statements at the command line, and they work successfully!?

Thanks for any light you can shed on this, it's driving me mad.

Cheers, Jason

Jason asked Dec 16 '10 12:12



1 Answer

Try executing the following query after you have selected the database:

SET NAMES 'utf8'

This query should solve the problem of differing character sets between your files and the database.
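
To illustrate why a character-set mismatch can end in question marks: when text is converted into a character set such as latin1 that has no code point for a character, that character is typically replaced by "?". The sketch below imitates this lossy step with mbstring. It does not reproduce MySQL's own conversion, and the helper name as_latin1 is made up for the example. It assumes PHP 7 or later with mbstring and a source file saved as UTF-8.

```php
<?php
// Illustration only: mbstring stands in for a server-side charset conversion.
// Make the substitute for unrepresentable characters explicit: "?" (0x3F).
mb_substitute_character(0x3F);

// Hypothetical helper: convert UTF-8 text to ISO-8859-1 (latin1).
function as_latin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

$chinese  = as_latin1('信息');      // no latin1 equivalent: becomes "??"
$croatian = as_latin1('vijesti');  // plain ASCII survives unchanged
```

This is consistent with the table in the question, where the Croatian and French rows are intact while the Chinese, Hebrew and Japanese rows are not.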

felix

Felix Geenen answered Oct 05 '22 22:10