I am using almost the latest version of PHP (5.5.11), and here is the problem. When I call json_encode on part of a string, it returns false. At first I was using substr, but then I realized that this is wrong when dealing with non-English strings. Even after switching to mb_substr, json_encode still returns false:
$s = "に搭載されるようになると、その手軽さからJは急速に普及していく。、通信に関する標準を策定する国際団体インターナショナル";
$a = mb_substr($s, 0, 10);
As you can see,
var_dump(json_encode(['d' => $a]));
returns false, and
var_dump(json_encode(['d' => $s]));
returns correct JSON.
Looking into json_last_error, I see that this is due to "Malformed UTF-8 characters, possibly incorrectly encoded". So the problem is that mb_substr gives me malformed characters.
When I look at var_dump($a); I see that it produces string(10) "に搭載�" (I assume that each Japanese character is 3 bytes, and that the question mark is a malformed character).
So how can I get a substring in such a way that the result is not a malformed string?
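Simply pass UTF-8 as the fourth (encoding) parameter of mb_substr() and you are good to go. When that parameter is omitted, mb_substr() falls back to the internal encoding (see mb_internal_encoding()). The string(10) in your var_dump output shows that it counted 10 bytes rather than 10 characters: three 3-byte characters plus the first byte of the fourth, which is why the result is malformed.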
$a = mb_substr($s, 0, 10, 'utf-8');
echo $a; // に搭載されるようにな
echo json_encode($a); // "\u306b\u642d\u8f09\u3055\u308c\u308b\u3088\u3046\u306b\u306a"
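If you would rather not repeat the encoding on every call, one possible approach is to set the internal encoding once with mb_internal_encoding(). The sketch below is only an illustration under a few assumptions: the mbstring extension is loaded, the source file is saved as UTF-8, and encode_or_fail() is a hypothetical helper name that is not part of PHP. It first reproduces the byte-wise cut by forcing a single-byte encoding, then shows both fixes.

```php
<?php
// Assumption: this file is saved as UTF-8 and the mbstring extension is loaded.
$s = "に搭載されるようになると、その手軽さからJは急速に普及していく。";

// Reproduce the problem: with a single-byte encoding, "10 characters" means 10 bytes,
// so the fourth 3-byte character is cut after its first byte.
$broken = mb_substr($s, 0, 10, 'ISO-8859-1');
var_dump(strlen($broken));                     // byte length of the cut string
var_dump(mb_check_encoding($broken, 'UTF-8')); // false means it is not valid UTF-8

// Fix 1: pass the encoding explicitly on each call.
$a = mb_substr($s, 0, 10, 'UTF-8');

// Fix 2: set the internal encoding once; mb_* calls that omit the
// encoding argument will use it from then on.
mb_internal_encoding('UTF-8');
$b = mb_substr($s, 0, 10);

// Hypothetical helper (not a PHP built-in): fail loudly instead of
// silently passing false along when encoding goes wrong.
function encode_or_fail($value)
{
    $json = json_encode($value);
    if ($json === false) {
        throw new RuntimeException('json_encode failed: ' . json_last_error_msg());
    }
    return $json;
}

$json = encode_or_fail(['d' => $a]);
echo $json, "\n";
```

Passing the encoding explicitly keeps each call independent of global configuration, while mb_internal_encoding() is convenient when the whole script works with UTF-8.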