
Why does locale.getpreferredencoding() return 'ANSI_X3.4-1968' instead of 'UTF-8'?

Whenever I try to read UTF-8 encoded text files using open(file_name, encoding='utf-8'), I always get an error saying the ASCII codec can't decode some characters (e.g. when using for line in f: print(line)).

Python 3.5.3 (default, Jan 19 2017, 14:11:04)
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getpreferredencoding()
'ANSI_X3.4-1968'
>>> import sys
>>> sys.getfilesystemencoding()
'ascii'
>>>

and the locale command prints:

locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE=en_HK.UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

jm33_m0 asked Jun 03 '17 13:06


1 Answer

I had a similar problem. For me, the environment variable LANG was initially not set (you can check this by running env):

$ python3 -c 'import locale; print(locale.getdefaultlocale())'
(None, None)
$ python3 -c 'import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968
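
The same check can be done from inside Python. The following is a minimal sketch, not part of the original session: the helper name locale_report is made up for illustration, and it assumes the usual precedence in which LC_ALL overrides LC_CTYPE, which in turn overrides LANG.

```python
import locale
import os

# Variables that decide the character encoding, highest precedence first
# (assumed order: LC_ALL, then LC_CTYPE, then LANG).
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")


def locale_report(environ=os.environ):
    """Collect the settings that influence Python's default text encoding.

    `locale_report` is a hypothetical helper name used for this sketch.
    """
    # A value of None means the variable is not set at all.
    report = {name: environ.get(name) for name in LOCALE_VARS}
    report["preferred_encoding"] = locale.getpreferredencoding()
    return report


if __name__ == "__main__":
    for key, value in locale_report().items():
        print("{}: {!r}".format(key, value))
```

If all three variables show None and the preferred encoding is ANSI_X3.4-1968, you are most likely in the situation described here.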

The available locales for me were (on a fresh Ubuntu 18.04 Docker image):

$ locale -a
C
C.UTF-8
POSIX
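
If the list is long, picking out the UTF-8 entries can be automated. This is only a sketch: utf8_locales is a hypothetical helper, and it assumes the codeset is written after a dot, as either "UTF-8" or "utf8".

```python
def utf8_locales(locale_a_output):
    """Return the names from `locale -a` output whose codeset is UTF-8.

    `utf8_locales` is a hypothetical helper name used for this sketch.
    """
    matches = []
    for name in locale_a_output.splitlines():
        name = name.strip()
        if "." not in name:
            continue  # e.g. "C" or "POSIX": no codeset in the name
        # The codeset sits between "." and an optional "@modifier".
        codeset = name.split(".", 1)[1].split("@", 1)[0]
        # Accept both spellings: "UTF-8" and "utf8".
        if codeset.lower().replace("-", "") == "utf8":
            matches.append(name)
    return matches


sample = "C\nC.UTF-8\nPOSIX\n"  # the output shown above
print(utf8_locales(sample))  # ['C.UTF-8']
```

On a real system the input string could come from subprocess.check_output(["locale", "-a"], universal_newlines=True).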

So I picked the UTF-8 one:

$ export LANG="C.UTF-8"

And then things work:

$ python3 -c 'import locale; print(locale.getdefaultlocale())'
('en_US', 'UTF-8')
$ python3 -c 'import locale; print(locale.getpreferredencoding())'
UTF-8

If you pick a locale that is not available, such as

$ export LANG="en_US.UTF-8"

it will not work:

$ python3 -c 'import locale; print(locale.getdefaultlocale())'
('en_US', 'UTF-8')
$ python3 -c 'import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968

This is also why the locale command prints these error messages when the configured locale is not installed:

locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
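
Python does not print such a warning on its own, so it can help to test a locale name explicitly. Below is a minimal sketch, assuming that locale.setlocale() raising locale.Error is a good enough signal that the locale is missing; locale_is_available is a hypothetical helper name.

```python
import locale


def locale_is_available(name):
    """Return True if the C library can activate `name` for LC_CTYPE.

    `locale_is_available` is a hypothetical helper name used for this sketch.
    """
    # Calling setlocale() without a value only queries the current setting.
    previous = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        # Raised when the requested locale is not installed.
        return False
    finally:
        # Put the previous setting back so the check has no side effects.
        locale.setlocale(locale.LC_CTYPE, previous)


print(locale_is_available("C"))  # True
```

On the Docker image described above, locale_is_available("en_US.UTF-8") would be expected to return False until that locale is generated.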

RasmusWL answered Oct 23 '22 10:10