Reading non-ASCII characters from a text file

I'm using Python 2.7. I've tried several things, including the codecs module, but none of them worked. How can I fix this?

myfile.txt

sözcük

My code

f = open('myfile.txt','r')
for line in f:
    print line
f.close()

Output

s\xc3\xb6zc\xc3\xbck

The output is the same in Eclipse and in the command window. I'm using Windows 7. Non-ASCII characters display without problems when they don't come from a file.

Rckt asked Apr 29 '12 23:04


1 Answer

import codecs

# Open the file and decode it as UTF-8 while reading
f = codecs.open("myfile.txt", "r", encoding="utf-8")
# Read the whole file into a unicode string
sfile = f.read()
f.close()

# Check the type of the decoded text
print type(sfile)  # <type 'unicode'>

# Encode the unicode string back to a UTF-8 byte string to display it
print sfile.encode('utf-8')

# Check the type of the encoded string
print type(sfile.encode('utf-8'))  # <type 'str'>
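
A variation on the same idea, offered only as a sketch and not part of the original answer: io.open also accepts an encoding argument and behaves the same on Python 2.7 and Python 3, and a with-block closes the file automatically. The helper name read_text and the temporary sample file are made up for illustration, and the sample assumes the file is saved as UTF-8.

```python
import io
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Return the file's contents as decoded text (hypothetical helper)."""
    # io.open decodes bytes to text while reading; the with-block closes the file.
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


# Write a small UTF-8 sample file so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with io.open(sample_path, "w", encoding="utf-8") as f:
    f.write(u"s\u00f6zc\u00fck\n")

text = read_text(sample_path)   # decoded text: unicode on 2.7, str on 3
raw = text.encode("utf-8")      # the UTF-8 bytes stored on disk

# repr() is printed so the result does not depend on the console encoding.
print(repr(raw))
```

The escaped bytes in the question's output (\xc3\xb6 and \xc3\xbc) are the UTF-8 encodings of ö and ü, so the file itself was read correctly; it only had not been decoded to text yet.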
Biruk Demelash answered Sep 28 '22 08:09