Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python 3 UnicodeEncodeError: 'ascii' codec can't encode characters

I've just started to learn Python but I already ran into troubles.
I have a simple script with just one command:

#!/usr/bin/env python3
print("Příliš žluťoučký kůň úpěl ďábelské ódy.") # Text in Czech 

When I try to run this script:

python3 hello.py 

I get this message:

Traceback (most recent call last):
  File "hello.py", line 2, in <module>
    print("P\u0159\xedli\u0161 \u017elu\u0165ou\u010dk\xfd k\u016fn \xfap\u011bl \u010f\xe1belsk\xe9 \xf3dy.")
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

I am using Kubuntu 16.04 and Python 3.5.2. When I tried this: export PYTHONIOENCODING=utf-8 It worked but only temporarily. Next time I opened bash I got the same error.

According to https://docs.python.org/3/howto/unicode.html#the-string-type the default encoding for Python source code is UTF-8.
So I have the source file saved id UTF-8, Konsole is set to UTF-8 but I still get the error!
Even if I add

# -*- coding: utf-8 -*-

to the beginning it does nothing.

Another weird thing: when I run it using only python, not python3, it works. How is it possible to work in Python 2.7.12 and not in 3.5.2?

Any ideas for solving this permanently? Thank you.

like image 420
oogabooga Avatar asked Dec 31 '16 13:12

oogabooga


1 Answers

Thanks to Mark Tolen and Alastair McCormack for suggesting where the problem may be. The problem was really in the locale settings.
When I ran locale, the output was:

LANG=C
LANGUAGE=
LC_CTYPE="C"
LC_NUMERIC=cs_CZ.UTF-8
LC_TIME=cs_CZ.UTF-8
LC_COLLATE=cs_CZ.UTF-8
LC_MONETARY=cs_CZ.UTF-8
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT=cs_CZ.UTF-8
LC_IDENTIFICATION="C"
LC_ALL=

This "C" is the default setting which uses the ANSI charmap. And that is where the problem was. Running locale charmap gave me: ANSI_X3.4-1968 which can not display non-English characters.
I fixed this using this Ubuntu documentation site.

I added these lines to /etc/default/locale:

LANGUAGE=cs_CZ.UTF-8
LC_ALL=cs_CZ.UTF-8

Then you have to restart your session (log out and in) to apply these settings.

Running locale now returns this output:

LANG=C
LANGUAGE=cs
LC_CTYPE="cs_CZ.UTF-8"
LC_NUMERIC="cs_CZ.UTF-8"
LC_TIME="cs_CZ.UTF-8"
LC_COLLATE="cs_CZ.UTF-8"
LC_MONETARY="cs_CZ.UTF-8"
LC_MESSAGES="cs_CZ.UTF-8"
LC_PAPER="cs_CZ.UTF-8"
LC_NAME="cs_CZ.UTF-8"
LC_ADDRESS="cs_CZ.UTF-8"
LC_TELEPHONE="cs_CZ.UTF-8"
LC_MEASUREMENT="cs_CZ.UTF-8"
LC_IDENTIFICATION="cs_CZ.UTF-8"
LC_ALL=cs_CZ.UTF-8

and running locale charmap returns:

UTF-8
like image 133
oogabooga Avatar answered Nov 13 '22 20:11

oogabooga