
Python read non-ascii text file

Tags:

python

utf-8

I am trying to load a text file that contains some German letters with

content=open("file.txt","r").read() 

which results in this error message:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 26: ordinal not in range(128)

If I modify the file to contain only ASCII characters, everything works as expected.

Apparently, using

content=open("file.txt","rb").read() 

or

content=open("file.txt","r",encoding="utf-8").read()

both do the job.

Why is it possible to read with "binary" mode and get the same result as with utf-8 encoding?

Paul Würtz asked Feb 16 '26 14:02


1 Answer

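In Python 3, opening a file in 'r' mode without specifying an encoding uses a platform-dependent default encoding (the one reported by locale.getpreferredencoding(False)), which in this case is ASCII. ASCII cannot represent the bytes that UTF-8 uses for the German letters (such as 0xc3), so decoding fails. Using 'rb' mode reads the file as bytes and makes no attempt to interpret it as a string of characters, so no decoding error can occur. The two working variants do not return the same thing, though: 'rb' gives you a bytes object, while encoding="utf-8" gives you a decoded str.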
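A minimal sketch of the difference between the two modes, assuming the file is UTF-8 encoded. The helper name read_both_ways and the sample text "Grüße" are made up for illustration and are not part of the question.

```python
import os
import tempfile


def read_both_ways(path):
    """Return (raw_bytes, decoded_text) for the file at `path`.

    Hypothetical helper: assumes the file is UTF-8 encoded.
    """
    # Binary mode: no decoding happens, the result is a bytes object.
    with open(path, "rb") as f:
        raw = f.read()
    # Text mode with an explicit encoding: the result is a str.
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
    return raw, text


# Write a small sample file containing German letters (illustrative data).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("Grüße")
    raw, text = read_both_ways(path)

print(type(raw).__name__, len(raw))    # bytes 7 (ü and ß take two bytes each)
print(type(text).__name__, len(text))  # str 5
print(raw.decode("utf-8") == text)     # True

# Decoding the same bytes as ASCII fails, as in the question's error message.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
print(ascii_failed)                    # True
```

So binary mode only looks like it "works" because it skips decoding entirely; to get text from it you still have to call .decode("utf-8") yourself.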

jbuchman answered Feb 19 '26 04:02


