
Chinese localization does not work with the PHP gettext extension, although English works

I have already localized a website from Russian to English with PHP and gettext, just by wrapping all strings in a __($string) function.

It works.

Here's the gist: https://gist.github.com/Grawl/ba8f39b8398791c6a67e

But it doesn't work with the Chinese translation. I just added the compiled .mo (and the source .po) to locale/zh_CN/LC_MESSAGES/, visited /index.php?locale=zh_CN, and don't see it translated at all.

What is wrong with Chinese?

Do I have to use another language code or something?

I use zh_CN for Chinese, as WordPress does.

I cannot understand why.


Update:

The problem was in the HTML <meta> tag and in the charset sent by the server, which was Windows-1251. Chop Russian PHP server.

After I set <meta charset="GBK"> and turned off AddDefaultCharset in .htaccess, Chinese localization finally started to work.

In the end, I made these modifications:

.htaccess:

- AddDefaultCharset UTF-8
+ AddDefaultCharset off
+ RewriteRule ^cn index.php?locale=zh_CN&charset=GBK [L]

functions.php, included before <!DOCTYPE html>:

+ if(isset($_GET["charset"])) {
+   $charset=$_GET["charset"];
+ } else {
+   $charset="UTF-8"; // default when no charset is passed
+ }

head.php, the <head> tag content:

+ <meta charset="<?=$charset?>">

So, if I do not set charset in the GET request, it defaults to UTF-8; otherwise it is taken from the GET request. For Chinese I set it to GBK, like on Taobao.com, and the browser picks up the right charset.

But after all that, I just get the Cyrillic characters encoded as Chinese glyphs, character by character.

Like this: Сервис и услуги

Becomes this: 褋械褉胁懈褋 懈 褍褋谢褍谐懈

If you paste these Chinese characters into a decoder app and choose GB2312 on the left (one of the Chinese charsets) and UTF-8 on the right, you will get ?е?ви? и ??л?ги. Some Cyrillic characters are corrupted, but this is obviously the original string, because in the translation I have the much shorter 服务 for this phrase.
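
The garbling described above can be reproduced outside the browser. The following is a minimal sketch, assuming the mbstring extension is available and using its CP936 encoding name (the Windows code page for GBK); the sample string is a lowercase form of the phrase from the question. It reinterprets the UTF-8 bytes of a Cyrillic string as GBK, which turns each two-byte Cyrillic letter into one Chinese glyph, and then reverses the step to show that the page still contains the untranslated original.

```php
<?php
// Sketch only: assumes ext-mbstring; 'CP936' is mbstring's name for the GBK code page.
$original = 'сервис и услуги'; // untranslated string, stored as UTF-8

// Read the UTF-8 bytes as if they were GBK, as a browser does when told charset=GBK.
$garbled = mb_convert_encoding($original, 'UTF-8', 'CP936');

// Undo the misreading: the GBK bytes of the glyphs are the original UTF-8 bytes.
$recovered = mb_convert_encoding($garbled, 'CP936', 'UTF-8');

echo $garbled, PHP_EOL;   // Chinese glyphs, one per Cyrillic letter
echo $recovered, PHP_EOL; // the Cyrillic string again
```

GB2312 is a subset of GBK and lacks some of these byte pairs, which is probably why the GB2312 decoder in the question shows question marks for a few letters.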

Help me please.


Update 2

I had simply forgotten to call bind_textdomain_codeset() for $domain, which was messages.

Everything works with a Unicode charset now. All is fine.

Даниил Пронин asked Jul 03 '15 14:07



1 Answer

Summary

I was able to make this work without changing the <meta charset="..."> value away from utf-8. You should also be able to remove the AddDefaultCharset rule from your .htaccess and remove the &charset=GBK from your RewriteRule. You need to make sure that your .po file is formatted and compiled correctly, and that the server can find the compiled file.

Explanation/Example

Setting the <meta charset="..."> tag only tells the browser what character encoding is being used on the page. PHP still needs to know which file to select to replace strings. And in any case, although this documentation suggests otherwise, I think you can still use UTF-8 to do Chinese localization. Here is a simple working example I set up on my system:

<?php
    // initialize locale-related variables
    $locale     = !empty( $_GET['locale'] ) ? $_GET['locale'] : 'en_US'; // no notice when the parameter is absent
    $domain     = 'bridges';
    $locale_dir = dirname( __FILE__ ) . '/locale'; // using absolute path!

    // set up locale
    putenv( "LC_ALL=$locale" );
    setlocale( LC_ALL, $locale );
    bindtextdomain( $domain, $locale_dir );
    bind_textdomain_codeset( $domain, 'UTF-8' );
    textdomain($domain);
?><!doctype html>
<html>
    <head>
        <meta charset="utf-8">
        <title><?= _( 'Localization Test' ) ?></title>
    </head>
    <body>
        <p><?= _( 'Hello' ) ?>!</p>
    </body>
</html>
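
The example above passes the locale query parameter straight to setlocale(). One way to restrict it to locales that actually have a translation is sketched below. The helper name pick_locale and the list of supported locales are hypothetical and not part of the original example.

```php
<?php
/**
 * Return the requested locale if it is in the supported list, else the default.
 * pick_locale() is a hypothetical helper, not a gettext function.
 */
function pick_locale(array $query, array $supported, $default = 'en_US')
{
    // Ignore missing or non-string values such as ?locale[]=x.
    $requested = (isset($query['locale']) && is_string($query['locale']))
        ? $query['locale']
        : '';

    // Strict comparison so that only exact, known locale names are accepted.
    return in_array($requested, $supported, true) ? $requested : $default;
}

$locale = pick_locale($_GET, array('en_US', 'zh_CN'));
```

The result can then be used in place of the raw query value when calling putenv() and setlocale().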

My .po file which is located at ./locale/zh_CN/LC_MESSAGES/bridges.po looks like:

msgid ""
msgstr ""
"Project-Id-Version: 1.0\n"
"PO-Revision-Date: 2015-07-20\n"
"Last-Translator: Morgan Benton\n"
"Language-Team: Chinese\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Language: zh_CN\n"

msgid "Localization Test"
msgstr "本土化试"

msgid "Hello"
msgstr "您好"
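
The catalog above declares its encoding in the Content-Type header. To confirm what a given .po file declares, the header can be read back from the file text. This is a rough sketch, assuming the header sits on one line as in the listing above; po_charset is a hypothetical helper, and a real .po parser would handle more cases.

```php
<?php
/**
 * Return the charset named in a .po Content-Type header, or null if none is found.
 * po_charset() is a hypothetical helper for illustration.
 */
function po_charset($po_text)
{
    // Look for a quoted header line such as "Content-Type: text/plain; charset=UTF-8\n".
    if (preg_match('/^"Content-Type:[^"\n]*charset=([A-Za-z0-9._-]+)/mi', $po_text, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Sample header text; in practice this would come from file_get_contents() on the .po file.
$header = "msgid \"\"\nmsgstr \"\"\n\"Content-Type: text/plain; charset=UTF-8\\n\"\n";
var_dump(po_charset($header));
```

The value it returns should match the codeset passed to bind_textdomain_codeset() and the charset the page declares.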

According to a comment on the gettext() documentation, you should put the character encoding and other relevant headers inside your .po file, e.g.

"Content-Type: text/plain; charset=UTF-8\n"

You can check the syntax of your .po file by running the command msgfmt -c bridges.po -o bridges.mo from your terminal. It will warn you if it thinks anything is wrong with your .po file. As the commenter suggested, I think you do NOT need to have the Chinese system libraries installed.
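
Once compiled, the catalog has to sit where gettext will look for it, which is the directory passed to bindtextdomain(), then the locale name, then LC_MESSAGES, then the domain name with a .mo extension. Below is a small sketch that builds this path and reports whether the file is readable, assuming the directory layout and the bridges domain from the example above; mo_path is a hypothetical helper.

```php
<?php
/**
 * Build the path where gettext expects the compiled catalog for a locale and domain.
 * mo_path() is a hypothetical helper for illustration.
 */
function mo_path($locale_dir, $locale, $domain)
{
    return rtrim($locale_dir, '/') . '/' . $locale . '/LC_MESSAGES/' . $domain . '.mo';
}

$path = mo_path(__DIR__ . '/locale', 'zh_CN', 'bridges');
echo $path, is_readable($path) ? ' is readable' : ' is missing or unreadable', PHP_EOL;
```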

P.S. I don't know if these Chinese translations are correct or not. This is just what Google Translate gave me! :)

morphatic answered Nov 05 '22 22:11
