
How to solve UnicodeDecodeError in Python 3.6?

I have switched from Python 2.7 to Python 3.6.

I have scripts that deal with some non-English content.

I usually run the scripts via cron and also in a terminal.

I had a UnicodeDecodeError in my Python 2.7 scripts, and I solved it like this:

# encoding=utf8
import sys

# Python 2.7 only: reload(sys) restores setdefaultencoding, which site.py removes at startup.
reload(sys)
sys.setdefaultencoding('utf8')

Now in Python 3.6, it doesn't work. I have print statements like print("Here %s" % (myvar)) and they throw an error. I can work around it by replacing myvar with myvar.encode("utf-8"), but I don't want to write that in every print statement.

I set PYTHONIOENCODING=utf-8 in my terminal and I still have the issue.

Is there a cleaner way to solve the UnicodeDecodeError issue in Python 3.6?

Is there any way to tell Python 3 to print everything in UTF-8, just like I did in Python 2?

Umair Ayub asked Jun 25 '18 14:06




1 Answer

It sounds like your locale is broken and that you have another bytes->Unicode issue. The thing you did for Python 2.7 is a hack that only masked the real problem (there's a reason why you have to reload sys to make it work).

To check your locale, run locale from the command line. The output should look something like:

LANG=en_GB.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_ALL=

locale depends on LANG being set properly. Python effectively uses the locale to work out which encoding to use when writing to stdout. If it can't work it out, it defaults to ASCII.

You should first attempt to fix your locale. If locale reports errors, make sure you've installed the correct language pack for your region.
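To see what Python actually derived from your environment, something like the following sketch may help. The helper name describe_encodings is made up for illustration; the calls it uses are from the standard sys and locale modules. Run it both from your terminal and from cron, since the two environments often differ.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python derived from the current environment."""
    return {
        # Encoding used when print() writes to stdout (None if stdout has none).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding implied by the locale settings (LANG, LC_ALL, LC_CTYPE).
        "preferred": locale.getpreferredencoding(False),
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print("%-10s %s" % (name, value))
```

If the stdout entry shows an ASCII variant (for example ANSI_X3.4-1968) rather than UTF-8, the locale is the likely culprit.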

If all else fails, you can always fix Python by setting PYTHONIOENCODING=UTF-8. This should be used as a last resort as you'll be masking problems once again.
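Note that PYTHONIOENCODING only has an effect if it is present in the environment of the Python process itself. A plain assignment in a shell without export, or a cron job that does not define the variable, would not pass it on. As a rough way to confirm the setting is picked up, this sketch starts a child interpreter with the variable set and reports the encoding that child sees (stdout_encoding_with is a hypothetical helper name):

```python
import os
import subprocess
import sys


def stdout_encoding_with(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set; return its stdout encoding."""
    # Copy the current environment and add (or override) the variable.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    # The child only prints an encoding name, which is plain ASCII.
    return out.decode("ascii").strip()


if __name__ == "__main__":
    print(stdout_encoding_with("utf-8"))
```

For cron, the equivalent would be to define the variable in the crontab or in the command line of the job, so that the script's own process inherits it.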

If Python is still throwing an error after setting PYTHONIOENCODING, then please update your question with the stack trace. Chances are you've got an implicit conversion going on.
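The exact exception type in the stack trace matters here. As a small illustration (the sample string and variable names are invented for this sketch), decoding bytes with the wrong codec raises UnicodeDecodeError, while writing text to a stream that cannot represent it raises UnicodeEncodeError. Only the second kind is influenced by the locale or PYTHONIOENCODING.

```python
import io

text = "Here caf\u00e9"
raw = text.encode("utf-8")  # bytes, as they might arrive from a file or socket

decode_error = None
encode_error = None

# bytes -> str with the wrong codec: this is where UnicodeDecodeError comes from.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    decode_error = exc

# str -> bytes on output: an ASCII-only stream stands in for a stdout
# whose encoding was derived from a broken locale.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
try:
    ascii_stream.write(text)
    ascii_stream.flush()
except UnicodeEncodeError as exc:
    encode_error = exc

print(type(decode_error).__name__)
print(type(encode_error).__name__)
```

If your trace really shows UnicodeDecodeError, look for the place where bytes are being turned into text (reading a file, a subprocess result, a network response) and pass the correct encoding there.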

Alastair McCormack answered Oct 19 '22 00:10