
json_encode charset problem

When I use json_encode to encode my multilingual strings, it also escapes special characters. What should I do to keep them as they are?

For example:

<?php
echo json_encode(array('şüğçö'));

It returns something like ["\u015f\u00fc\u011f\u00e7\u00f6"]

But I want ["şüğçö"]

Oguz Bilgic asked Jun 14 '10 06:06



2 Answers

Try the JSON_UNESCAPED_UNICODE flag (available since PHP 5.4):

<?php
echo json_encode(array('şüğçö'), JSON_UNESCAPED_UNICODE);
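
To make the difference concrete, here is a minimal sketch comparing the default output with the flagged output. It assumes the script file is saved as UTF-8 and runs on PHP 5.4 or newer; the variable names are only for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and PHP is 5.4 or newer.
$data = array('şüğçö');

// Default behaviour: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($data);

// With the flag: the characters are emitted as raw UTF-8.
$plain = json_encode($data, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n"; // ["\u015f\u00fc\u011f\u00e7\u00f6"]
echo $plain, "\n";   // ["şüğçö"]
```

If the output goes to a browser or an API client, the receiving side has to read it as UTF-8 for the unescaped form to display correctly.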
Deka answered Oct 21 '22 13:10


In JSON, any character in a string may be represented by a Unicode escape sequence. Thus "\u015f\u00fc\u011f\u00e7\u00f6" is semantically equal to "şüğçö".

Although those characters can also be written literally, json_encode escapes them by default, probably to avoid character encoding issues.
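
A quick way to check that the two forms are equivalent is to decode both and compare the results. This is only a sketch, assuming the file is saved as UTF-8; the variable names are made up for the example.

```php
<?php
// Assumption: this file is saved as UTF-8.
// Single quotes keep the backslashes literal, so this is the escaped JSON text.
$escapedJson = '["\u015f\u00fc\u011f\u00e7\u00f6"]';
$plainJson   = '["şüğçö"]';

// Decode both JSON documents into PHP arrays.
$fromEscaped = json_decode($escapedJson, true);
$fromPlain   = json_decode($plainJson, true);

// Both decode to the same UTF-8 string.
var_dump($fromEscaped === $fromPlain); // bool(true)
echo $fromEscaped[0], "\n";            // şüğçö
```

So any conforming JSON parser on the receiving end should give back the original characters either way; the escaped form only looks different on the wire.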

Gumbo answered Oct 21 '22 11:10