
Can't get (UTF-8) Traditional Chinese characters to work with the PHP gettext extension (.po and .mo files created in Poedit)

According to MSDN the locale string is zh_Hant, but I also tried zh_TW (Chinese, Taiwan).

The Traditional Chinese characters look fine in Poedit, but when I open the page in the browser they show up as garbage symbols («¢Åo¥@¬É!). I think the translation itself is working and the problem is the encoding (I set both Charset and Source Code Charset to UTF-8).

The files generated with Poedit:

messages.po:

msgid ""
msgstr ""
"Project-Id-Version: \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2010-02-15 16:26+0800\n"
"PO-Revision-Date: 2010-02-15 16:26+0800\n"
"Last-Translator: Jano Chen <[email protected]>\n"
"Language-Team: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Poedit-KeywordsList: _;gettext;gettext_noop\n"
"X-Poedit-Basepath: C:\\wamp\\www\\php-test\n"
"X-Poedit-Language: Chinese\n"
"X-Poedit-Country: TAIWAN\n"
"X-Poedit-SourceCharset: utf-8\n"
"X-Poedit-SearchPath-0: .\n"

#: test.php:3
msgid "Hello World!"
msgstr "哈囉世界!"

PS: When I switch the character encoding in Firefox to Big5 the characters display correctly, but when I switch it to UTF-8 it shows: ���o�@��!

alexchenco asked Feb 15 '10 08:02



1 Answer

I finally solved it. I had to add the following line to localization.php:

bind_textdomain_codeset("messages", 'UTF-8');
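
For context, here is a minimal sketch of where that call might sit in a complete localization.php. It is not the asker's actual file: the `locale` directory layout, the `putenv` call, and the `Content-Type` header are assumptions added for illustration. The likely cause of the symptom is that, without `bind_textdomain_codeset`, gettext returns translations in the locale's default codeset (apparently Big5 here, given the Firefox observation) rather than in the UTF-8 the page declares.

```php
<?php
// localization.php: an illustrative sketch, not the original file.
// Assumption: the compiled catalog is at
//   ./locale/zh_TW/LC_MESSAGES/messages.mo
// and the zh_TW locale is available on the system.

$locale = 'zh_TW';     // locale tried in the question
$domain = 'messages';  // matches messages.po / messages.mo

// Some platforms read the language from the environment as well.
putenv("LC_ALL=$locale");
setlocale(LC_ALL, $locale);

// Point gettext at the directory holding the catalogs (hypothetical path).
bindtextdomain($domain, __DIR__ . '/locale');

// The fix from this answer: ask gettext to return translated strings
// as UTF-8 instead of the locale's default codeset.
bind_textdomain_codeset($domain, 'UTF-8');

// Make this domain the default for _() / gettext().
textdomain($domain);

// Declare the same encoding to the browser so both sides agree.
header('Content-Type: text/html; charset=UTF-8');

// Prints the msgstr if the catalog is found, otherwise the msgid.
echo _('Hello World!');
```

If the catalog cannot be found, `_()` falls back to returning the original msgid, so seeing "Hello World!" unchanged points to a path or locale problem rather than an encoding one.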

alexchenco answered Sep 21 '22 01:09