'utf-8' codec can't decode byte when reading a file in Python 3.4 but not in Python 2.7

Question

I was reading a file in Python 2.7 and it was read perfectly. The problem is that when I run the same program in Python 3.4, this error appears:

'utf-8' codec can't decode byte 0xf2 in position 424: invalid continuation byte

Also, when I run the program on Windows (with Python 3.4), the error doesn't appear. The first line of the file is: Codi;Codi_lloc_anonim;Nom

and the code of my program is:

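def lectdict(filename, colkey, colvalue):
    # Map each value in column colkey to the list of values in column colvalue.
    D = dict()
    # The with block closes the file even if reading fails.
    with open(filename, 'r') as f:
        for line in f:
            if line == '\n':
                continue  # skip blank lines
            fields = line.split(';')
            D[fields[colkey]] = D.get(fields[colkey], []) + [fields[colvalue]]
    return D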

Traduccio = lectdict('Noms_departaments_centres.txt',1,2)

dyomas · Accepted Answer

In my case I can't change the encoding, because my file really is UTF-8 encoded. However, some rows are corrupted and cause the same error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 7092: invalid continuation byte

My solution was to open the file in binary mode:

open(filename, 'rb')
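
In binary mode each line arrives as a bytes object, so the text still has to be decoded at some point. Below is a minimal sketch of one way to do that, assuming the goal is to keep the rows that are valid UTF-8 and set the corrupted ones aside. The helper name read_utf8_lines is hypothetical and not part of the original answer.

```python
def read_utf8_lines(filename):
    """Return (good_lines, bad_line_numbers) for a mostly-UTF-8 file.

    Hypothetical helper: decodes line by line so that one corrupted
    row does not abort the whole read.
    """
    good, bad = [], []
    with open(filename, 'rb') as f:           # binary mode: no implicit decoding
        for lineno, raw in enumerate(f, start=1):
            try:
                good.append(raw.decode('utf-8'))
            except UnicodeDecodeError:
                bad.append(lineno)            # remember which rows were corrupted
    return good, bad
```

If dropping rows is not acceptable, open(filename, 'r', encoding='utf-8', errors='replace') keeps every row and substitutes U+FFFD for the bytes that cannot be decoded.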

oscarcapote · Answer

OK, I did what @unutbu told me. The result was a long list of candidate encodings, one of which was cp1250, so I changed:

f = open(filename,'r')

to

f = open(filename,'r', encoding='cp1250')

as @triplee suggested, and now I can read my files.
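
As background on why the platforms differ: in Python 3, open() in text mode falls back to the locale's preferred encoding when no encoding is given. That is commonly UTF-8 on Linux and a legacy code page on Windows. In Python 2, open() returns undecoded byte strings, so no decoding error can occur at read time. The sketch below prints that default and shows a variant of the question's lectdict with an added encoding parameter. The parameter and its cp1250 default are assumptions taken from this answer, not something verified against the actual file.

```python
import locale

# Encoding that open() falls back to in text mode when none is given.
print(locale.getpreferredencoding(False))

def lectdict(filename, colkey, colvalue, encoding='cp1250'):
    """Group column `colvalue` by column `colkey` in a ';'-separated file.

    The `encoding` parameter is an illustrative addition; pass whatever
    encoding the file was really written in.
    """
    D = {}
    with open(filename, 'r', encoding=encoding) as f:
        for line in f:
            line = line.rstrip('\n')   # drop the newline so the last column is clean
            if not line:
                continue               # skip blank lines
            fields = line.split(';')
            D.setdefault(fields[colkey], []).append(fields[colvalue])
    return D
```

Single-byte code pages accept almost any byte, so a successful read does not by itself prove that cp1250 is the right choice. It is worth checking that the accented characters in the result look correct.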

Tags: python, python-3.x, utf-8
