$test = json_encode('بسم الله');
echo $test;
This code outputs "\u0628\u0633\u0645 \u0627\u0644\u0644\u0647"
when I expected something like "بسم الله". Arabic characters are escaped when they are JSON-encoded, whereas the YouTube API returns them unescaped:
http://gdata.youtube.com/feeds/api/videos/RqMxTnTZeNE?v=2&alt=json
You can see that YouTube displays the Arabic characters properly. What could be my mistake?
Note: I'm working on an API; the example above is just for clarification.
So yes, JSON does support it.
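The Arabic letters used in ordinary text each fit in a single UTF-16 code unit (2 bytes). In UTF-8 they take 2 bytes (the basic Arabic block, U+0600 to U+06FF) or 3 bytes (the presentation forms), so for the letters themselves UTF-16 is never larger and is sometimes smaller. ASCII characters such as spaces, digits, and JSON punctuation go the other way: 1 byte in UTF-8 and 2 in UTF-16. Either encoding is far more compact than the 6-byte \uXXXX escape.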
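To check these byte counts yourself, here is a minimal sketch. The helper names utf8Bytes and utf16Bytes are made up for illustration. The UTF-16 size is derived from the code points rather than by actually converting the string, so no extension is needed. It assumes PHP 7.0 or later (for the "\u{...}" string syntax), valid UTF-8 input, and a source file saved as UTF-8.

```php
<?php
// Hypothetical helpers, not part of PHP: count the bytes a string needs
// in UTF-8 and in UTF-16. Input is assumed to be valid UTF-8.

function utf8Bytes(string $s): int
{
    // PHP strings are byte arrays, so strlen() is the UTF-8 byte count.
    return strlen($s);
}

function utf16Bytes(string $s): int
{
    $bytes = 0;
    // The /u modifier makes PCRE split on code points instead of bytes.
    foreach (preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        // A 4-byte UTF-8 sequence is a character outside the BMP, which
        // needs a surrogate pair (4 bytes); everything else is 2 bytes.
        $bytes += (strlen($char) === 4) ? 4 : 2;
    }
    return $bytes;
}

$letter = "\u{0628}";        // a letter from the basic Arabic block
$presentation = "\u{FE8F}";  // a character from Arabic Presentation Forms-B

echo utf8Bytes($letter), ' ', utf16Bytes($letter), "\n";             // 2 2
echo utf8Bytes($presentation), ' ', utf16Bytes($presentation), "\n"; // 3 2
echo utf8Bytes('بسم الله'), ' ', utf16Bytes('بسم الله'), "\n";       // 15 16
```

The last line shows the effect of the ASCII space: seven 2-byte letters plus one space come to 15 bytes in UTF-8 but 16 in UTF-16.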
The JSON spec requires decoders to support UTF-8, so every JSON decoder handles raw UTF-8 just as well as it handles the numeric escape sequences. This is also true of JavaScript interpreters, which means JSONP will handle UTF-8 encoded JSON as well.
JSON data always uses the Unicode character set.
"\u0628\u0633\u0645 \u0627\u0644\u0644\u0647"
and "بسم الله"
are equivalent in JSON.
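A quick way to convince yourself of the equivalence is to decode both forms and compare the results. This is only a sketch; the variable names are arbitrary, and it assumes the source file is saved as UTF-8.

```php
<?php
// The escaped form, written in single quotes so PHP itself leaves the
// backslashes alone and json_decode() sees the \uXXXX sequences.
$escaped = '"\u0628\u0633\u0645 \u0627\u0644\u0644\u0647"';
// The literal form: the same JSON string with raw UTF-8 characters.
$literal = '"بسم الله"';

$fromEscaped = json_decode($escaped);
$fromLiteral = json_decode($literal);

var_dump($fromEscaped === $fromLiteral); // bool(true)
var_dump($fromEscaped === 'بسم الله');   // bool(true)
```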
PHP simply defaults to Unicode escapes instead of literals for non-ASCII characters.
You can specify otherwise with JSON_UNESCAPED_UNICODE (provided you are using PHP 5.4 or later).
json_encode('بسم الله', JSON_UNESCAPED_UNICODE);
That is the correct JSON-encoded version of the UTF-8 string. There is no need to change it; it represents the correct string. Characters in JSON can be escaped this way.
JSON can also represent non-ASCII characters directly as UTF-8 if you want. Since PHP 5.4 you can set the JSON_UNESCAPED_UNICODE flag to produce raw UTF-8 strings:
json_encode($string, JSON_UNESCAPED_UNICODE)
But that is only a preference; it is not necessary.
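Since the flag is a matter of preference, an API could treat it as a simple switch. Below is a rough sketch of that idea; encodeResponse and its $rawUnicode parameter are hypothetical names, not part of PHP or any framework. It assumes PHP 7.0 or later for the type declarations.

```php
<?php
// Hypothetical helper for an API response body; the name and the
// $rawUnicode switch are illustrative, not from any framework.
function encodeResponse(array $data, bool $rawUnicode = true): string
{
    $flags = $rawUnicode ? JSON_UNESCAPED_UNICODE : 0;
    $json = json_encode($data, $flags);
    if ($json === false) {
        // json_encode() returns false on failure, e.g. invalid UTF-8 input.
        throw new RuntimeException('JSON encoding failed: ' . json_last_error_msg());
    }
    return $json;
}

$payload = ['title' => 'بسم الله'];
echo encodeResponse($payload), "\n";        // {"title":"بسم الله"}
echo encodeResponse($payload, false), "\n"; // {"title":"\u0628\u0633\u0645 \u0627\u0644\u0644\u0647"}
```

Either output decodes to the same data on the client, so the choice mostly affects response size and how readable the raw body is. A real endpoint would also send a Content-Type: application/json header before echoing the body.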
Both formats are valid and equivalent JSON strings:
char
    any-Unicode-character-except-"-or-\-or-control-character
    \"
    \\
    \/
    \b
    \f
    \n
    \r
    \t
    \u four-hex-digits
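To see the other escapes from this grammar in action, here is a small sketch that encodes a string containing a quote, a backslash, a slash, and two control characters. The sample string is made up for illustration.

```php
<?php
// A string containing a quote, a backslash, a slash, a tab and a newline.
$raw = "say \"hi\" C:\\tmp a/b\tend\n";

echo json_encode($raw), "\n";
// "say \"hi\" C:\\tmp a\/b\tend\n"

// The \/ escape is optional; PHP emits it unless told otherwise.
echo json_encode($raw, JSON_UNESCAPED_SLASHES), "\n";
// "say \"hi\" C:\\tmp a/b\tend\n"

// Control characters without a short form use \u four-hex-digits.
echo json_encode("\x01"), "\n"; // "\u0001"
```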
If you prefer the unescaped version, simply add the JSON_UNESCAPED_UNICODE flag:
<?php
$test = json_encode('بسم الله', JSON_UNESCAPED_UNICODE);
echo $test;
This flag requires PHP 5.4.0 or later.
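If the same code might also run on an older PHP, one possible approach is to check whether the constant exists before using it. This is a sketch only; the $flags variable name is arbitrary.

```php
<?php
// Use the flag only when this PHP build defines it (5.4.0 or later);
// otherwise fall back to 0, which gives the default escaped output.
$flags = defined('JSON_UNESCAPED_UNICODE') ? JSON_UNESCAPED_UNICODE : 0;

echo json_encode('بسم الله', $flags);
```

On an older version this prints the \uXXXX form, which clients decode to the same string.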