
UnicodeDecodeError, invalid continuation byte

Why does the decode call below fail? Why does it succeed with the "latin-1" codec?

o = "a test of \xe9 char"  # I want this to remain a string, as this is what I am receiving
v = o.decode("utf-8")

Which results in:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 10: invalid continuation byte
asked Apr 05 '11 13:04 by RuiDC


People also ask

What is an invalid continuation byte?

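The Python "UnicodeDecodeError: 'utf-8' codec can't decode byte in position: invalid continuation byte" occurs when bytes that were written with a different codec are decoded as UTF-8. In UTF-8, a byte such as 0xe9 starts a multi-byte sequence and must be followed by continuation bytes in the range 0x80-0xbf; if the next byte is anything else, decoding fails. To solve the error, specify the encoding the data was actually written in, e.g. latin-1.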
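
As a minimal sketch of how the error arises, the snippet below reproduces the failure from the question and the successful latin-1 decode. It assumes Python 3 and that the received data is a bytes object (the question itself uses a Python 2 str); the variable names are illustrative only.

```python
# The data from the question, written as a Python 3 bytes literal.
raw = b"a test of \xe9 char"

# 0xE9 is 0b11101001. In UTF-8 that bit pattern opens a 3-byte sequence,
# so the decoder expects the next byte to look like 0b10xxxxxx.
# The next byte here is a space (0x20), so decoding is rejected.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# latin-1 maps every byte 0x00-0xFF to the code point with the same number,
# so this decode cannot fail; 0xE9 becomes U+00E9.
text = raw.decode("latin-1")

print(utf8_error.reason)
print(utf8_error.start)
print(text)
```

Because latin-1 accepts any byte sequence, a successful decode only shows that the bytes are representable, not that latin-1 is the encoding the sender actually used.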

What does UnicodeDecodeError mean in Python?

The Python "UnicodeDecodeError: 'ascii' codec can't decode byte in position" occurs when we use the ascii codec to decode bytes that were encoded using a different codec. To solve the error, specify the correct encoding, e.g. utf-8 .


1 Answer

I had the same error when I tried to open a CSV file with the pandas.read_csv method.

The solution was to change the encoding to latin-1:

pd.read_csv('ml-100k/u.item', sep='|', names=m_cols, encoding='latin-1')
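
If it is not known in advance whether a file is UTF-8 or latin-1, one possible approach is to try UTF-8 first and fall back to latin-1. The sketch below uses only the standard library; `first_working_encoding` is a hypothetical helper, not part of pandas, and the sample bytes are made up for illustration.

```python
def first_working_encoding(data, candidates=("utf-8", "latin-1")):
    """Return the first candidate encoding that decodes `data` without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError("none of the candidate encodings could decode the data")


# Made-up pipe-separated sample containing the latin-1 byte 0xE9.
sample = b"1|caf\xe9\n2|tea\n"

encoding = first_working_encoding(sample)
rows = [line.split("|") for line in sample.decode(encoding).splitlines()]

print(encoding)
print(rows)
```

The returned name could then be passed as the encoding argument of pandas.read_csv. Since latin-1 never raises, it should stay last in the candidate list, otherwise it would mask every other encoding.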
answered Sep 18 '22 10:09 by Mazen Aly