I have a dictionary `data` where I have stored:
- key: the ID of an event
- value: the name of this event, which is a UTF-8 string
Now I want to write this map to a JSON file. I tried this:
with open('events_map.json', 'w') as out_file:
    json.dump(data, out_file, indent = 4)
but this gives me the error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte
Now, I also tried with:
with io.open('events_map.json', 'w', encoding='utf-8') as out_file:
    out_file.write(unicode(json.dumps(data, encoding="utf-8")))
but this raises the same error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte
I also tried with:
with io.open('events_map.json', 'w', encoding='utf-8') as out_file:
    out_file.write(unicode(json.dumps(data, encoding="utf-8", ensure_ascii=False)))
but this raises the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xbf in position 3114: ordinal not in range(128)
Any suggestions on how I can solve this problem?
EDIT: I believe this is the entry that is causing the problem:
>>> data['142']
'\xbf/ANCT25'
EDIT 2:
The `data` variable is read from a file. So, after reading it from a file:
data_file_lines = io.open(file_name, 'r', encoding='utf8').readlines()
I then do:
with io.open('data/events_map.json', 'w', encoding='utf8') as json_file:
    json.dump(data, json_file, ensure_ascii=False)
Which gives me the error:
TypeError: must be unicode, not str
Then, I try to do this with the data dictionary:
for tuple in sorted_tuples:  # the `data` variable is initialized from these tuples
    data[str(tuple[1])] = json.dumps(tuple[0], ensure_ascii=False, encoding='utf8')
which is, again, followed by:
with io.open('data/events_map.json', 'w', encoding='utf8') as json_file:
    json.dump(data, json_file, ensure_ascii=False)
but again, the same error:
TypeError: must be unicode, not str
I get the same error when I use the simple `open` function for reading from the file:
data_file_lines = open(file_name, "r").readlines()
The exception is caused by the contents of your `data` dictionary: at least one of the keys or values is a byte string that is not valid UTF-8.
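One way to locate the offending entries is to try decoding every byte string in the dictionary as UTF-8 and collect the ones that fail. This is only a minimal sketch, assuming the keys and values are byte strings (`str` on Python 2, `bytes` on Python 3); `find_non_utf8` is a hypothetical helper name, not part of the `json` module, and the sample data is made up to resemble the question.

```python
def find_non_utf8(mapping):
    """Return the keys of entries whose key or value is not valid UTF-8."""
    bad_keys = []
    for key, value in mapping.items():
        for item in (key, value):
            if isinstance(item, bytes):
                try:
                    item.decode('utf-8')
                except UnicodeDecodeError:
                    # Record the entry once, then move on to the next one.
                    bad_keys.append(key)
                    break
    return bad_keys


# Hypothetical sample modelled on the question; b'...' literals are byte strings.
sample = {b'141': b'Some event', b'142': b'\xbf/ANCT25'}
bad = find_non_utf8(sample)
```

Here `bad` should contain only the key `142`, because `0xbf` cannot start a UTF-8 sequence.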
You'll have to replace this value, either by substituting a value that is UTF-8 encoded, or by decoding just that value to a unicode object with whatever encoding is correct for it:
data['142'] = data['142'].decode('latin-1')
to decode that string as a Latin-1-encoded value instead. In Latin-1, the byte 0xbf is the character ¿.
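If more than one entry may be affected, the same idea can be applied to the whole dictionary before dumping it. The following is a sketch under assumptions, not a definitive fix: it assumes Latin-1 is the right fallback for anything that is not valid UTF-8 (only you can confirm that for your data), and `to_text` and `dump_events` are hypothetical helper names. It is written with Python 3 in mind; the decoding step itself is the same on Python 2.

```python
import io
import json


def to_text(value, fallback='latin-1'):
    """Decode a byte string as UTF-8, falling back to another encoding."""
    if not isinstance(value, bytes):
        return value  # already text, or not a string at all
    try:
        return value.decode('utf-8')
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Latin-1.
        return value.decode(fallback)


def dump_events(mapping, path):
    """Write the mapping to path as UTF-8 encoded JSON."""
    clean = {to_text(k): to_text(v) for k, v in mapping.items()}
    # ensure_ascii=False keeps non-ASCII characters readable in the file.
    text = json.dumps(clean, indent=4, ensure_ascii=False)
    with io.open(path, 'w', encoding='utf-8') as out_file:
        out_file.write(text)
```

Calling `dump_events(data, 'events_map.json')` would then write every name as text, with the problematic entry stored as `¿/ANCT25`.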