I'm trying to run the following:
import json
path = 'ch02/usagov_bitly_data2012-03-16-1331923249.txt'
records = [json.loads(line) for line in open(path)]
But I get the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 6987: ordinal not in range(128)
From what I've found online, this usually means the encoding needs to be set to utf-8, but my default encoding is already utf-8:
sys.getdefaultencoding()
Out[43]: 'utf-8'
It also looks like my file is in utf-8, so I'm really confused. Also, the following code works:
In [15]: path = 'ch02/usagov_bitly_data2012-03-16-1331923249.txt'
In [16]: open(path).readline()
Is there a way to solve this?
Thanks!
EDIT:
When I run the code in my console it works, but not when I run it in Spyder, provided by Anaconda (https://www.continuum.io/downloads).
Do you know what could be going wrong?
The text file contains non-ASCII characters on at least one line. On your setup, the default file encoding is somehow set to ascii instead of utf-8, so specify the file's encoding explicitly:
import json
path = 'ch02/usagov_bitly_data2012-03-16-1331923249.txt'
records = [json.loads(line.strip()) for line in open(path, encoding="utf-8")]
(Doing this is a good idea anyway, even when the default works.)
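As for why sys.getdefaultencoding() says 'utf-8' while open() still fails: on Python 3 that function always returns 'utf-8', and it is not what open() consults. Without an encoding argument, open() falls back to the locale's preferred encoding, which can differ between a terminal and an IDE such as Spyder if the two are started with different environment settings (that is a likely explanation here, not something confirmed for your machine). Below is a minimal sketch that prints both values and reproduces the failure on a small temporary file. The load_records helper and the sample line are made up for illustration; they are not from the book's data set.

```python
import json
import locale
import os
import sys
import tempfile

# Always 'utf-8' on Python 3; this is NOT what open() uses by default.
print("sys.getdefaultencoding():", sys.getdefaultencoding())
# The locale's preferred encoding is what open() falls back to
# when no encoding= argument is given.
print("locale preferred encoding:", locale.getpreferredencoding(False))


def load_records(path):
    """Hypothetical helper: parse one JSON object per line, decoding as UTF-8."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Made-up sample line containing U+2019, whose UTF-8 encoding starts with byte 0xe2.
sample_line = json.dumps({"note": "it\u2019s"}, ensure_ascii=False)
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write(sample_line + "\n")
    sample_path = tmp.name

# Reading with an explicit UTF-8 encoding works regardless of the locale.
records = load_records(sample_path)

# Forcing ascii reproduces the kind of error shown in the question.
try:
    with open(sample_path, encoding="ascii") as handle:
        handle.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

os.remove(sample_path)
```

If the second printed value is an ASCII variant (for example 'ANSI_X3.4-1968') inside Spyder but 'UTF-8' in your console, that would explain the difference. Passing encoding="utf-8" explicitly, as above, sidesteps it either way, and the with block also makes sure the file is closed.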