I'm switching from Python 2 to Python 3.
In my Jupyter notebook the code is:
import json

file = "./data/test.json"
with open(file) as data_file:
    data = json.load(data_file)
It worked fine with Python 2, but after switching to Python 3 it gives me this error:
UnicodeDecodeError: 'gbk' codec can't decode byte 0xad in position 123: illegal multibyte sequence
The test.json file looks like this:
[{
    "name": "Daybreakers",
    "detail_url": "http://www.movieinsider.com/m4120/daybreakers/",
    "movie_tt_id": "中文"
}]
If I delete the Chinese characters, the error goes away.
So what should I do?
There are a lot of similar questions on SO, but I couldn't find a good solution for my case. If you find an applicable one, please tell me and I'll close this one.
Thanks a lot!
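You need to specify the correct encoding when you open the file. In Python 3, open() in text mode decodes with the locale's preferred encoding when none is given (here 'gbk', as the error message shows), so a UTF-8 file containing non-ASCII characters can fail to decode. If the JSON is encoded with UTF-8 you can do this: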
import json

fname = "test.json"
with open(fname, encoding='utf-8') as data_file:
    data = json.load(data_file)

print(data)
Output:
[{'name': 'Daybreakers', 'detail_url': 'http://www.movieinsider.com/m4120/daybreakers/', 'movie_tt_id': '中文'}]
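To see which encoding open() would pick by default on your machine, and to compare two ways of reading the file, here is a minimal self-contained sketch. It assumes the JSON really is UTF-8; the temporary file and the sample data are made up for illustration and stand in for your test.json. The second option relies on json.load accepting a binary file object, which should work on Python 3.6 and later.

```python
import json
import locale
import os
import tempfile

# The encoding open() falls back to in text mode when none is given.
# On a Chinese-locale Windows machine this is typically 'cp936' (an alias of GBK).
print(locale.getpreferredencoding(False))

# Hypothetical sample data, written as UTF-8 to a temporary file.
sample = [{"name": "Daybreakers", "movie_tt_id": "中文"}]
path = os.path.join(tempfile.mkdtemp(), "test.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False)

# Option 1: state the encoding explicitly when opening in text mode.
with open(path, encoding="utf-8") as f:
    data_text = json.load(f)

# Option 2: open in binary mode and let the json module decode the bytes
# (it detects UTF-8, UTF-16 or UTF-32 input on Python 3.6+).
with open(path, "rb") as f:
    data_bytes = json.load(f)

print(data_text == data_bytes)
```

Either option avoids depending on the system locale. If the file turns out to be in some other encoding, pass that name to open() instead of 'utf-8'.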