OK, I am currently in PHP/MySQL/UTF-8/Unicode hell!
My environment:
MySQL: 5.1.53
Server characterset: latin1
Db characterset: latin1
Client characterset: latin1
Conn. characterset: latin1
PHP: 5.3.3
My PHP files are saved as UTF-8, not ASCII.
In my PHP code, I do the following when I make the database connection:
ini_set('default_charset', 'utf-8');
$my_db = mysql_connect(DEV_DB, DEV_USER, DEV_PASS);
mysql_select_db(MY_DB);
// I have tried both of the following utf8 connection functions
// mysql_query("SET NAMES 'utf8'", $my_db);
mysql_set_charset('utf8', $my_db);
// Detect if form value is not UTF-8
if (mb_detect_encoding($_POST['lang_desc']) == 'UTF-8') {
    $lang_description = $_POST['lang_desc'];
} else {
    $lang_description = utf8_encode($_POST['lang_desc']);
}
$language_sql = sprintf(
    'INSERT INTO app_languages (language_id, app_id, description) VALUES (%d, %d, "%s")',
    intval($lang_data['lang_id']),
    intval($new_app_id),
    mysql_real_escape_string($lang_description, $my_db)
);
The CREATE statement for my MySQL table is:
CREATE TABLE IF NOT EXISTS app_languages (
    language_id int(10) unsigned NOT NULL,
    app_id int(10) unsigned NOT NULL,
    description tinytext COLLATE utf8_unicode_ci,
    PRIMARY KEY (language_id, app_id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
The SQL statements that are generated from my PHP code look like this:
INSERT INTO app_languages (language_id, app_id, description) VALUES (91, 2055, "阿拉伯体育新闻和信息")
INSERT INTO app_languages (language_id, app_id, description) VALUES (26, 2055, "阿拉伯體育新聞和信息")
INSERT INTO app_languages (language_id, app_id, description) VALUES (56, 2055, "בערבית ספורט חדשות ומידע")
INSERT INTO app_languages (language_id, app_id, description) VALUES (69, 2055, "アラビア語のスポーツニュースと情報")
Yet the rows appear in my database like this:
| 69 | 2055 | ????????????????? |
| 56 | 2055 | ?????? ????? ????? ????? |
| 28 | 2055 | Arapski sportske vijesti i informacije |
| 42 | 2055 | Arabe des nouvelles sportives et d\'information |
| 91 | 2055 | ?????????? |
What am I doing wrong??
P.S. We can use PuTTY to SSH directly to the database server and paste one of the Unicode/multilingual INSERT statements into the MySQL command line, and it works successfully!?
Thanks for any light you can shed on this, it's driving me mad.
Cheers, Jason
The first thing you need to do is modify your php.ini file to use UTF-8 as the default character set: default_charset = "utf-8" (Note: You can subsequently use phpinfo() to verify that this has been set properly.)
Definition and Usage. The utf8_encode() function encodes an ISO-8859-1 string to UTF-8. Unicode is a universal standard, and has been developed to describe all possible characters of all languages plus a lot of symbols with one unique number for each character/symbol.
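Since utf8_encode() assumes its input is ISO-8859-1, it only helps when the bytes really are not UTF-8 already. A minimal sketch of a stricter check follows, assuming the mbstring extension is loaded. The helper name to_utf8() is hypothetical and not from the original post, and mb_convert_encoding() from ISO-8859-1 is used as a stand-in for the same conversion utf8_encode() performs.

```php
<?php
// Hypothetical helper for illustration; not part of the original post.
function to_utf8($value)
{
    // mb_check_encoding() validates the bytes strictly as UTF-8,
    // whereas mb_detect_encoding() only guesses from a candidate list.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    // Same conversion utf8_encode() performs: ISO-8859-1 to UTF-8.
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$hebrew = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D"; // "שלום" as UTF-8 bytes
$latin1 = "caf\xE9";                          // "café" as ISO-8859-1 bytes

var_dump(to_utf8($hebrew) === $hebrew);       // valid UTF-8 passes through unchanged
var_dump(to_utf8($latin1) === "caf\xC3\xA9"); // Latin-1 input is converted to UTF-8
```

With a strict check like this, text that is already UTF-8 is never run through the ISO-8859-1 conversion a second time.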
Try executing the following query after you have selected the database:
SET NAMES 'utf8'
This query should resolve the mismatch between the character set of your files and that of the database connection.
felix
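Literal question marks are what a conversion produces when the target character set cannot represent the text, which is one way a latin1 step between PHP and the table could turn Hebrew or Japanese into "?" while leaving Latin-script rows intact. The sketch below only illustrates that lossy conversion inside PHP with mbstring, using ISO-8859-1 as an approximation of latin1. It does not talk to MySQL, and it is an assumption that this is what happens in the setup described above.

```php
<?php
// Make the substitution character explicit ("?" is also mbstring's default).
mb_substitute_character(0x3F);

$hebrew = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D"; // "שלום" encoded as UTF-8
$french = "d'information";                    // plain ASCII, representable in latin1

// Converting to ISO-8859-1 loses every character that set cannot represent.
$lossy_hebrew = mb_convert_encoding($hebrew, 'ISO-8859-1', 'UTF-8');
$lossy_french = mb_convert_encoding($french, 'ISO-8859-1', 'UTF-8');

var_dump($lossy_hebrew); // one "?" per Hebrew letter
var_dump($lossy_french); // unchanged
```

If the stored rows show this pattern, comparing the output of SHOW VARIABLES LIKE 'character_set%' from the PHP connection and from the command-line client may show where the two sessions differ.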