Python 3: JSON File Load with Non-ASCII Characters

Question

I am trying to load this JSON file (which contains non-ASCII characters) as a Python dictionary with Unicode strings, but I keep getting this error:

return codecs.ascii_decode(input, self.errors)[0]

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 90: ordinal not in range(128)

JSON file content = "tooltip":{ "dxPivotGrid-sortRowBySummary": "Sort\"{0}\"byThisRow",}

import sys  
import json

data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json') as f:
    for line in f:
        data.append(json.loads(line.encode('utf-8','replace')))

tdelaney · Accepted Answer

You have several problems, as near as I can tell. The first is the file encoding. When you open a file in text mode without specifying an encoding, Python uses the locale's preferred encoding (locale.getpreferredencoding(False)); sys.getfilesystemencoding() only governs how file names are encoded. Since the default may vary (especially on Windows machines), it's a good idea to explicitly use encoding="utf-8" for most JSON files. Because of your error message, I suspect that the file was opened with an ASCII encoding.
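A minimal sketch of checking the text-mode default and reading with an explicit encoding. The temporary file and its contents are made up for illustration and stand in for the real pt-PT.json:

```python
import locale
import os
import tempfile

# Encoding that open() uses in text mode when encoding= is omitted
# (unless Python's UTF-8 mode is enabled).
default_encoding = locale.getpreferredencoding(False)
print("default text encoding:", default_encoding)

# Hypothetical sample file, written as raw UTF-8 bytes.
sample = '{"navbar": "Operações de grupo"}'
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "wb") as f:
    f.write(sample.encode("utf-8"))

# Passing encoding="utf-8" makes the read independent of the locale.
with open(path, encoding="utf-8") as f:
    text = f.read()

os.remove(path)
print(text == sample)
```

If the first line prints something like ANSI_X3.4-1968 or US-ASCII, that would be consistent with the 'ascii' codec error in the question.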

Next, the file is decoded from UTF-8 into Python strings as it is read by the file object. Each line has already been decoded to a str and is ready for json to read. When you do line.encode('utf-8','replace'), you encode the line back into a bytes object, which json.loads (that is, "load string") could not handle before Python 3.6.
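To illustrate the point about types, here is a small sketch using a made-up one-line JSON document in place of a line read from the file:

```python
import json

# What iterating over a text-mode file yields: an already-decoded str.
line = '{"navbar": "Operações de grupo"}'

# The str can go straight into json.loads; no encode step is needed.
parsed = json.loads(line)

# Encoding turns the text back into bytes, which is a step backwards here.
encoded = line.encode("utf-8", "replace")

print(type(line).__name__, type(encoded).__name__)
print(parsed["navbar"])
```

The extra encode call adds nothing: the text has to be a str (or, on newer Python versions, UTF-8 bytes) by the time json parses it.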

Finally, "tooltip":{ "navbar":"Operações de grupo"} isn't valid JSON on its own, but it does look like one line of a pretty-printed JSON file containing a single JSON object. My guess is that you should read the entire file as one JSON object.
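A short sketch of why parsing line by line fails for a pretty-printed file. The document below is generated in memory as an assumption about what the real file roughly looks like:

```python
import json

# Build a pretty-printed JSON document similar in shape to the one described.
pretty = json.dumps(
    {"tooltip": {"navbar": "Operações de grupo"}},
    indent=4,
    ensure_ascii=False,
)
lines = pretty.splitlines()

# Try to parse each line on its own and count how many are rejected.
failures = 0
for line in lines:
    try:
        json.loads(line)
    except json.JSONDecodeError:
        failures += 1

# Parsing the whole text at once works.
whole = json.loads(pretty)

print(f"{failures} of {len(lines)} lines failed to parse on their own")
print(whole["tooltip"]["navbar"])
```

Every line of such a file is only a fragment of the object, so the parser has to see the full text.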

Putting it all together you get:

import json

with open('/Users/myvb/Desktop/Automation/pt-PT.json', encoding="utf-8") as f:
    data = json.load(f)

From its name, it's possible that this file is encoded with a Portuguese code page rather than UTF-8. If so, the "cp860" encoding may work better.
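If the encoding is uncertain, one option is to try UTF-8 first and fall back to a code page. The helper name load_json_with_fallback and the sample file are hypothetical, and the candidate list is only an assumption based on the guess above:

```python
import json
import os
import tempfile


def load_json_with_fallback(path, encodings=("utf-8", "cp860")):
    """Hypothetical helper: decode with each encoding in turn.

    Returns the parsed data and the encoding that decoded without error.
    """
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return json.loads(raw.decode(enc)), enc
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError(f"none of {encodings} could decode {path}")


# Made-up sample file, written in cp860 to exercise the fallback path.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "wb") as f:
    f.write('{"navbar": "Operações de grupo"}'.encode("cp860"))

data, used = load_json_with_fallback(path)
os.remove(path)
print(used, data["navbar"])
```

UTF-8 goes first because it rejects most non-UTF-8 input outright. A single-byte code page can decode the wrong bytes without raising, so it is worth checking that the accented characters in the result look right.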

Tags: python, json, python-3.x

min2bro