
Problem with json_encode and UTF-8 [duplicate]

I have a problem with the json_encode function and special characters.

For example I try this:

$string="Svrček";

echo "ENCODING=".mb_detect_encoding($string); //ENCODING=UTF-8

echo "JSON=".json_encode($string); //JSON="Svr\u010dek"

What can I do to display the string correctly, so JSON="Svrček"?

Thank you very much.

epi82 asked May 19 '11 12:05




1 Answer

json_encode() is not actually outputting JSON* there. It’s outputting a JavaScript string. (It outputs JSON when you give it an object or an array to encode.) That’s fine, as a JavaScript string is what you want.

In JavaScript (and in JSON), č may be escaped as \u010d. The two are equivalent, so there’s nothing wrong with what json_encode() is doing, and it should work fine; I’d be very surprised if this is actually causing you any problem. However, if the transfer is safely in a Unicode encoding (UTF-8, usually)†, there’s no need for the escaping either. To turn it off, pass the JSON_UNESCAPED_UNICODE flag: json_encode('Svrček', JSON_UNESCAPED_UNICODE). Note that JSON_UNESCAPED_UNICODE was introduced in PHP 5.4.0 and is unavailable in earlier versions.
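To make the equivalence concrete, here is a minimal sketch that encodes the string from the question both ways and checks that the two results decode to the same value. It assumes PHP 5.4 or later and a source file saved as UTF-8; the variable names $escaped and $unescaped are only illustrative.

```php
<?php
// Assumes PHP >= 5.4 (for JSON_UNESCAPED_UNICODE) and a UTF-8 source file.
$string = "Svrček";

// Default behaviour: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($string);

// With the flag: the character is emitted literally as UTF-8 bytes.
$unescaped = json_encode($string, JSON_UNESCAPED_UNICODE);

echo "ESCAPED=" . $escaped . "\n";     // ESCAPED="Svr\u010dek"
echo "UNESCAPED=" . $unescaped . "\n"; // UNESCAPED="Svrček"

// Either form decodes back to the same PHP string.
var_dump(json_decode($escaped) === json_decode($unescaped)); // bool(true)
```

If the unescaped form is sent to a browser, the response should also declare its encoding (for example with a Content-Type header that specifies charset=utf-8) so the literal č is not misread.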

By the way, contrary to what @onteria_ says, JSON does use UTF-8:

The character encoding of JSON text is always Unicode. UTF-8 is the only encoding that makes sense on the wire, but UTF-16 and UTF-32 are also permitted.


* Or, at least, it's not outputting JSON as defined in RFC 4627. However, there are other definitions of JSON, by which scalar values are allowed.

† JSON may be in UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE.

TRiG answered Oct 09 '22 16:10