'utf8' codec can't decode byte 0xf3

I am using Python 2.7 to read JSON files. My code is:

import json
from json import JSONDecoder
import os

path=os.path.dirname(os.path.abspath(__file__))+'/json'
print path
for root, dirs, files in os.walk(os.path.dirname(path+'/json')):
    for f in files:  
        if f.lower().endswith((".json")):
            fp=open(root + '/'+f)
            data = fp.read()
            print data.decode('utf-8')

But I got the following error.

UnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 72: invalid continuation byte
Rajiv asked Jun 23 '15 07:06

People also ask

What is unicodedecodeerror 'UTF-8' codec can't decode?

The Python "UnicodeDecodeError: 'utf-8' codec can't decode byte in position: invalid continuation byte" occurs when we specify an incorrect encoding when decoding a bytes object. To solve the error, specify the correct encoding, e.g. latin-1.

What is unicodedecodeerror 0xF3?

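The message "UnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 72: invalid continuation byte" means the decoder found a 0xF3 byte at offset 72 that is not part of a valid UTF-8 multibyte sequence. In UTF-8, 0xF3 can only begin a four-byte sequence, and here it is not followed by the three continuation bytes that sequence requires. JSON is defined to use UTF-8, but your file is not valid UTF-8.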

Why is my JSON file not valid in UTF-8?

JSON is defined to use UTF-8, but a 0xF3 byte that is not followed by continuation bytes is not valid UTF-8, so your file is not valid UTF-8. A common workaround is to force a different encoding, usually 'latin-1'. That never raises an error, because Latin-1 maps every possible byte to a character, but it silently produces wrong characters if the file was actually written in some other encoding. If you know the actual encoding of the input, by all means use that.
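The point about "incorrect results" can be seen on a few bytes. The following is a minimal sketch, not taken from the question: the byte string `raw` is a made-up sample containing 0xF3, and `cp1251` is chosen only as an example of a second single-byte encoding that also accepts the byte.

```python
from __future__ import print_function, unicode_literals

# Hypothetical sample: the name u'Ram\xf3n' (o with acute accent) saved as Latin-1.
raw = b'Ram\xf3n'

# Decoding as UTF-8 fails: 0xF3 announces a four-byte sequence,
# but the next byte (0x6E, the letter n) is not a continuation byte.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print('utf-8 failed:', exc)

# Latin-1 accepts any byte, so this succeeds and gives the intended text.
as_latin1 = raw.decode('latin-1')

# Another single-byte encoding also succeeds, but maps 0xF3 to a
# different character (a Cyrillic letter), with no error to warn you.
as_cp1251 = raw.decode('cp1251')

print(as_latin1 == as_cp1251)
```

Both single-byte decodes run without error, yet they disagree on what the byte means, which is why the real encoding of the file has to be known rather than guessed.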


1 Answer

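Your file is not encoded in UTF-8. In Python 2, fp.read() returns the raw bytes, so the error is actually raised by the data.decode('utf-8') call. If the file is Latin-1 (ISO-8859-1), where 0xF3 is the letter ó, open it with that encoding so the text is decoded as it is read: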

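import io
# io.open decodes while reading and returns unicode text, so no .decode() call is needed
with io.open(filename, encoding='latin-1') as fp:
    data = fp.read()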

The portable, platform-independent way to join path components is:

os.path.join(root, f)
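Putting both suggestions together, the loop from the question could be sketched roughly as follows. This is an illustration rather than the answerer's code: the helper name `read_json_texts` is hypothetical, and the default of 'latin-1' is an assumption about the files that should be replaced with their real encoding if it is known.

```python
from __future__ import print_function

import io
import os


def read_json_texts(top, encoding='latin-1'):
    """Yield (path, text) for every .json file found under `top`.

    `encoding` is an assumption about how the files were written;
    pass the real encoding if you know it.
    """
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.lower().endswith('.json'):
                # os.path.join picks the right separator for the platform.
                full_path = os.path.join(root, name)
                # io.open behaves the same on Python 2 and 3 and
                # returns already-decoded (unicode) text.
                with io.open(full_path, encoding=encoding) as fp:
                    yield full_path, fp.read()
```

To mirror the question, it could be called with `os.path.join(os.path.dirname(os.path.abspath(__file__)), 'json')` as `top`, printing each returned text. The `with` block also closes each file, which the original loop did not do.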
Arcturus B answered Sep 19 '22 00:09
