I want to dump a dictionary to a file, as in "Dump Python dictionary to JSON file". But I ran into an encoding problem. When I simply do
print(data)
I get something like this in terminal:
{'legend': '\n\r\n\t\tНа прямой расположены стойла, в которые необходимо расставить коров так, чтобы минимальное расcтояние между коровами было как можно больше.\r\n \n', 'input_specification': '\n\r\n Входные данные\r\n \n\r\n\t\tВ первой строке вводятся числа N\xa0 (2 < N < 10001) – количество стойл и K\xa0 (1 < K < N ) – количество коров. Во второй строке\xa0задаются N натуральных чисел в порядке возрастания – координаты стойл (координаты не превосходят 109)\r\n \n', 'output_specification': '\n\r\n Выходные данные\r\n \n\r\n\t\tВыведите одно число – наибольшее возможное допустимое расстояние.\r\n \n'}
So it is normal, human-readable text. But when I dump the same dictionary to a JSON file this way:
with open('Data\{0}.json'.format(i), 'w') as file:
    json.dump(data, file)
there is a strange mess of escape sequences in the file:
{"legend": "\n\r\n\t\t\u041d\u0430 \u043f\u0440\u044f\u043c\u043e\u0439 \u0440\u0430\u0441\u043f\u043e\u043b\u043e\u0436\u0435\u043d\u044b \u0441\u0442\u043e\u0439\u043b\u0430, \u0432 \u043a\u043e\u0442\u043e\u0440\u044b\u0435 \u043d\u0435\u043e\u0431\u0445\u043e\u0434\u0438\u043c\u043e \u0440\u0430\u0441\u0441\u0442\u0430\u0432\u0438\u0442\u044c \u043a\u043e\u0440\u043e\u0432 \u0442\u0430\u043a, \u0447\u0442\u043e\u0431\u044b \u043c\u0438\u043d\u0438\u043c\u0430\u043b\u044c\u043d\u043e\u0435 \u0440\u0430\u0441c\u0442\u043e\u044f\u043d\u0438\u0435 \u043c\u0435\u0436\u0434\u0443 \u043a\u043e\u0440\u043e\u0432\u0430\u043c\u0438 \u0431\u044b\u043b\u043e \u043a\u0430\u043a \u043c\u043e\u0436\u043d\u043e \u0431\u043e\u043b\u044c\u0448\u0435.\r\n \n", "input_specification": "\n\r\n \u0412\u0445\u043e\u0434\u043d\u044b\u0435 \u0434\u0430\u043d\u043d\u044b\u0435\r\n \n\r\n\t\t\u0412 \u043f\u0435\u0440\u0432\u043e\u0439 \u0441\u0442\u0440\u043e\u043a\u0435 \u0432\u0432\u043e\u0434\u044f\u0442\u0441\u044f \u0447\u0438\u0441\u043b\u0430 N\u00a0 (2 < N < 10001) \u2013 \u043a\u043e\u043b\u0438\u0447\u0435\u0441\u0442\u0432\u043e \u0441\u0442\u043e\u0439\u043b \u0438 K\u00a0 (1 < K < N ) \u2013 \u043a\u043e\u043b\u0438\u0447\u0435\u0441\u0442\u0432\u043e \u043a\u043e\u0440\u043e\u0432. 
\u0412\u043e \u0432\u0442\u043e\u0440\u043e\u0439 \u0441\u0442\u0440\u043e\u043a\u0435\u00a0\u0437\u0430\u0434\u0430\u044e\u0442\u0441\u044f N \u043d\u0430\u0442\u0443\u0440\u0430\u043b\u044c\u043d\u044b\u0445 \u0447\u0438\u0441\u0435\u043b \u0432 \u043f\u043e\u0440\u044f\u0434\u043a\u0435 \u0432\u043e\u0437\u0440\u0430\u0441\u0442\u0430\u043d\u0438\u044f \u2013 \u043a\u043e\u043e\u0440\u0434\u0438\u043d\u0430\u0442\u044b \u0441\u0442\u043e\u0439\u043b (\u043a\u043e\u043e\u0440\u0434\u0438\u043d\u0430\u0442\u044b \u043d\u0435 \u043f\u0440\u0435\u0432\u043e\u0441\u0445\u043e\u0434\u044f\u0442 109)\r\n \n", "output_specification": "\n\r\n \u0412\u044b\u0445\u043e\u0434\u043d\u044b\u0435 \u0434\u0430\u043d\u043d\u044b\u0435\r\n \n\r\n\t\t\u0412\u044b\u0432\u0435\u0434\u0438\u0442\u0435 \u043e\u0434\u043d\u043e \u0447\u0438\u0441\u043b\u043e \u2013 \u043d\u0430\u0438\u0431\u043e\u043b\u044c\u0448\u0435\u0435 \u0432\u043e\u0437\u043c\u043e\u0436\u043d\u043e\u0435 \u0434\u043e\u043f\u0443\u0441\u0442\u0438\u043c\u043e\u0435 \u0440\u0430\u0441\u0441\u0442\u043e\u044f\u043d\u0438\u0435.\r\n \n"}
I tried to specify ensure_ascii=False like here: Python Saving JSON Files as UTF-8, but it throws UnicodeEncodeError:
UnicodeEncodeError: 'charmap' codec can't encode characters in position 11-12: character maps to <undefined>
All in all, how can I dump a dictionary to a JSON file without messing up the encoding?
You can convert a dictionary to a JSON string using the json.dumps() method. Encoding data as JSON is usually called serialization: transforming data into a series of bytes (hence "serial") that can be stored or transmitted across a network.
To convert a dictionary to JSON you can use json.dumps(), which converts a dictionary to a str object, not a dict. To use the data as a dict again, you have to load the str back with the json module.
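A minimal sketch of that round trip, assuming a small made-up dictionary (the keys and values here are placeholders, not the data from the question). json.loads() is the standard-library call that turns the string back into a dict:

```python
import json

# Hypothetical sample data, only for illustration.
data = {"legend": "На прямой расположены стойла", "count": 3}

text = json.dumps(data)      # dict -> str (JSON text)
restored = json.loads(text)  # str -> dict

print(type(text).__name__)      # the serialized form is a plain string
print(type(restored).__name__)  # loading it gives a dict again
print(restored == data)
```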
The JSON spec requires UTF-8 support by decoders. As a result, all JSON decoders can handle UTF-8 just as well as they can handle the numeric escape sequences. This is also the case for JavaScript interpreters, which means JSONP will handle UTF-8-encoded JSON as well.
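To illustrate that the escaped and the readable forms carry the same data, here is a small sketch using Python's own decoder (the sample word is made up). The two strings differ as text but decode to equal objects:

```python
import json

sample = {"word": "корова"}  # hypothetical sample value

escaped = json.dumps(sample)                       # default: non-ASCII becomes \uXXXX
readable = json.dumps(sample, ensure_ascii=False)  # keeps the Cyrillic characters as-is

print(escaped)
print(readable)

# Different text, same data once decoded.
same_data = json.loads(escaped) == json.loads(readable)
print(same_data)
```

So the \uXXXX output in the question is valid JSON that loads back correctly; it is just hard for a human to read.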
The json.dump() method writes a Python object as JSON-formatted data to a file. The json.dumps() method encodes a Python object into a JSON-formatted string.
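The difference can be sketched with an in-memory text buffer standing in for a real file (io.StringIO is used here only to keep the example self-contained; the data is a placeholder):

```python
import io
import json

data = {"output_specification": "Выведите одно число"}  # hypothetical sample

# json.dump writes into a file-like object and returns None.
buffer = io.StringIO()
json.dump(data, buffer, ensure_ascii=False)
written = buffer.getvalue()

# json.dumps returns the JSON text as a str.
as_string = json.dumps(data, ensure_ascii=False)

print(written == as_string)
```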
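You need to specify the file encoding when you open the file. Otherwise Python uses the platform's default encoding, which on Windows is typically a legacy code page (the 'charmap' codec in your error) that cannot represent every character in the data.
with open('Data{0}.json'.format(1), 'w', encoding='utf-8') as file:
    json.dump(data, file, ensure_ascii=False)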
This way I dumped your example data successfully.
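For anyone who wants to check this end to end, here is a hedged sketch that writes to a temporary directory and reads the file back. The helper names save_json and load_json and the sample text are my own, not part of the original code:

```python
import json
import tempfile
from pathlib import Path


def save_json(data, path):
    """Write data as human-readable UTF-8 JSON (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as file:
        json.dump(data, file, ensure_ascii=False)


def load_json(path):
    """Read the file back with the same explicit encoding."""
    with open(path, encoding="utf-8") as file:
        return json.load(file)


# Made-up sample containing Cyrillic, a no-break space and an en dash.
data = {"legend": "На прямой расположены стойла\xa0– координаты"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "1.json"
    save_json(data, path)
    raw = path.read_text(encoding="utf-8")  # what actually landed in the file
    restored = load_json(path)

print(raw)
print(restored == data)
```

Passing encoding='utf-8' on both the write and the read side keeps the result independent of the machine's default encoding.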