I apologize if this question has been asked before. I am still not clear about encoding in Python 3.2.
I am reading a CSV file (encoded in UTF-8 without a BOM) that contains French accented characters.
Here is the code that opens and reads the CSV file:
csvfile = open(in_file, 'r', encoding='utf-8')
fieldnames = ("id", "locale", "message")
reader = csv.DictReader(csvfile, fieldnames, escapechar="\\")
for row in reader:
    if row['id'] == id and row['locale'] == locale:
        out = row['message']
I return the message (out) as JSON:
jsonout = json.dumps(out, ensure_ascii=True)
return HttpResponse(jsonout,content_type="application/json; encoding=utf-8")
However, when I preview the result, the French accented e (é) is replaced by \u00e9.
Can you please advise on what I am doing wrong, and what I should do so that the JSON output shows the proper accented e?
Thanks
You're doing nothing wrong (and neither is Python).
Python's json module simply takes the safe route and escapes non-ASCII characters. This is a valid way of representing such characters in JSON, and any conforming parser will restore the proper Unicode characters when parsing the string:
>>> import json
>>> json.dumps({'Crêpes': 5})
'{"Cr\\u00eapes": 5}'
>>> json.loads('{"Cr\\u00eapes": 5}')
{'Crêpes': 5}
Don't forget that JSON is just a representation of your data, and both "ê" and "\u00ea" are valid JSON representations of the string ê. Conforming JSON parsers should handle both correctly.
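To make the equivalence concrete, here is a minimal sketch (the variable names are only illustrative) that parses both JSON representations and checks that they yield the same Python string:

```python
import json

# Two different JSON texts for the same one-character string.
escaped_json = '"\\u00ea"'   # JSON text "\u00ea"; the backslash is doubled for the Python literal
literal_json = '"ê"'         # the character written out directly

escaped_value = json.loads(escaped_json)
literal_value = json.loads(literal_json)

# Both parse to the same string, so the comparison prints True.
print(escaped_value == literal_value)
```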
It is possible to disable this behaviour, though; see the json.dump documentation:
>>> json.dumps({'Crêpes': 5}, ensure_ascii=False)
'{"Crêpes": 5}'
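If you do switch to ensure_ascii=False, the resulting string contains non-ASCII characters, so it has to be encoded before it goes over the wire. The following is only a sketch: the helper name to_json_bytes is hypothetical, and it assumes the response is meant to be UTF-8.

```python
import json

def to_json_bytes(message):
    """Serialize message to JSON, keeping non-ASCII characters literal,
    and return the result encoded as UTF-8 bytes."""
    text = json.dumps(message, ensure_ascii=False)
    return text.encode("utf-8")

body = to_json_bytes("café")

# Decoding and parsing the bytes gives back the original string.
round_tripped = json.loads(body.decode("utf-8"))
print(round_tripped == "café")
```

A client that decodes the body as UTF-8 and parses it gets the same string it would get from the escaped form; only the bytes on the wire differ.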
With respect to the answer above: setting ensure_ascii=False makes the special characters appear literally in your output. That said, marcelm's answer is still correct, as no information is lost with either representation.