
PHP - json_encode(string, JSON_UNESCAPED_UNICODE) not escaping Czech chars

I'm selecting some data from a database and encoding it as JSON, but I've got a problem with Czech characters such as

á,í,ř,č,ž...

My file is UTF-8 encoded, my database is UTF-8 encoded as well, and I've set the header to UTF-8 too. What else should I do?

My code:

header('Content-Type: text/html; charset=utf-8');
while($tmprow = mysqli_fetch_array($result)) {
    $row['user'] = mb_convert_encoding($tmprow['user'], "UTF-8", "auto");
    $row['package'] = mb_convert_encoding($tmprow['package'], "UTF-8", "auto");
    $row['url'] = mb_convert_encoding($tmprow['url'], "UTF-8", "auto");
    $row['rating'] = mb_convert_encoding($tmprow['rating'], "UTF-8", "auto");

    array_push($response, $row);
}

$json = json_encode($response, JSON_UNESCAPED_UNICODE);

if(!$json) {
    echo "error";
}

And part of the printed JSON: "package":"zv???tkanalouce"

EDIT: Without the mb_convert_encoding() calls, the printed string is empty and "error" is printed.

Jakub Turcovsky asked Apr 27 '14 10:04


1 Answer

With the code you've got in your example, the output is:

json_encode($response, JSON_UNESCAPED_UNICODE);
"package":"zv???tkanalouce"

The question marks are there because mb_convert_encoding introduced them. This happens when you use encoding detection ("auto" as the third parameter) and the detection cannot handle a character in the input: that character is then replaced with a question mark. An example line from your code:

$row['url'] = mb_convert_encoding($tmprow['url'], "UTF-8", "auto");

This also means that the data coming out of your database is not UTF-8 encoded because mb_convert_encoding($buffer, 'UTF-8', 'auto'); does not introduce question marks if $buffer is UTF-8 encoded.

Therefore you need to find out which charset is used in your database connection because the database driver will convert strings into the encoding of the connection.
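To see the effect without a database, here is a minimal sketch. It assumes the connection delivers ISO-8859-2 (Latin-2) bytes, which is only a guess at what your setup does, and it needs the mbstring extension. It shows that such bytes are not valid UTF-8, that json_encode() rejects them (which matches the "error" from your EDIT), and that naming the real source encoding instead of "auto" converts them without loss:

```php
<?php
// "zvířátkanalouce", written with escapes so this file's own encoding does not matter.
$utf8 = "zv\u{00ED}\u{0159}\u{00E1}tkanalouce";

// Simulate what a non-UTF-8 connection might hand back.
// ISO-8859-2 is an assumption for illustration, not something known about your server.
$latin2 = mb_convert_encoding($utf8, 'ISO-8859-2', 'UTF-8');

// These bytes are not valid UTF-8 ...
$isUtf8 = mb_check_encoding($latin2, 'UTF-8');

// ... so json_encode() returns false and reports a UTF-8 error.
$json = json_encode([$latin2], JSON_UNESCAPED_UNICODE);
$isUtf8Error = (json_last_error() === JSON_ERROR_UTF8);

// With the source encoding named explicitly, the conversion is lossless.
$restored = mb_convert_encoding($latin2, 'UTF-8', 'ISO-8859-2');

var_dump($isUtf8, $json, $isUtf8Error, $restored === $utf8);
```

Converting in PHP like this only works if you know the source encoding for certain, which is why fixing the connection charset is usually the cleaner route.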

The easiest fix is to tell the database connection that you want UTF-8 strings and then use them as they are:

$mysqli = new mysqli("localhost", "my_user", "my_password", "test");

/* check connection */
if (mysqli_connect_errno()) {
    printf("Connect failed: %s\n", mysqli_connect_error());
    exit();
}

/* change character set to utf8 */
if (!$mysqli->set_charset("utf8")) {
    printf("Error loading character set utf8: %s\n", $mysqli->error);
} else {
    printf("Current character set: %s\n", $mysqli->character_set_name());
}

The code example above only shows how to set the default client character set to UTF-8 with mysqli. It is taken from the PHP manual; see also the related material on this site, e.g. "utf 8 - PHP and MySQLi UTF8".

You can then greatly improve your code:

$response = $result->fetch_all(MYSQLI_ASSOC);

$json = json_encode($response, JSON_UNESCAPED_UNICODE);

if (FALSE === $json) {
    throw new LogicException(
        sprintf('Not json: %d - %s', json_last_error(), json_last_error_msg())
    );
}

header('Content-Type: application/json');
echo $json;
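For reference, a small sketch of what the flag changes and of an alternative to the FALSE check. The $response array here is a hypothetical stand-in for the fetched rows, and JSON_THROW_ON_ERROR assumes PHP 7.3 or later:

```php
<?php
// Hypothetical row standing in for the fetched result set;
// the value is "zvířátkanalouce", written with escapes.
$response = [['package' => "zv\u{00ED}\u{0159}\u{00E1}tkanalouce"]];

// Default: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($response);

// With the flag: the characters stay as literal UTF-8 in the output.
$readable = json_encode($response, JSON_UNESCAPED_UNICODE);

// Both forms are valid JSON and decode to the same data.
$same = json_decode($escaped, true) === json_decode($readable, true);

// PHP 7.3+: have json_encode() throw instead of returning false.
try {
    json_encode(["\xED\xF8\xE1"], JSON_THROW_ON_ERROR); // invalid UTF-8 bytes
    $message = null;
} catch (JsonException $e) {
    $message = $e->getMessage();
}

echo $escaped, "\n", $readable, "\n";
```

So JSON_UNESCAPED_UNICODE only affects how the output looks. It does not make invalid input acceptable, and the input still has to be valid UTF-8 either way.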
hakre answered Oct 28 '22 06:10