
Changing the “locale preferred encoding” in Python 3 in Windows

I'm using Python 3 (recently switched from Python 2). My code usually runs on Linux, but occasionally on Windows. According to the Python 3 documentation for open(), when the encoding argument is not supplied, the default encoding for a text file comes from locale.getpreferredencoding(). I want this default to be UTF-8 for a project of mine, no matter which OS it's running on (currently it's always UTF-8 on Linux, but not on Windows). The project has many calls to open(), and I don't want to add encoding='utf-8' to all of them. Thus, I want to change the locale's preferred encoding on Windows, as Python 3 sees it.

I found a previous question, "Changing the "locale preferred encoding"", which has an accepted answer, so I thought I was good to go. Unfortunately, neither of the commands suggested in that answer and its first comment works for me on Windows. Specifically, they suggest running chcp 65001 and set PYTHONIOENCODING=UTF-8, and I've tried both. Please see the transcript below from my cmd window:

> py -i
Python 3.4.3 ...
>>> f = open('foo.txt', 'w')
>>> f.encoding
'cp1252'
>>> exit()

> chcp 65001
Active code page: 65001

> py -i
Python 3.4.3 ...
>>> f = open('foo.txt', 'w')
>>> f.encoding
'cp1252'
>>> exit()

> set PYTHONIOENCODING=UTF-8

> py -i
Python 3.4.3 ...
>>> f = open('foo.txt', 'w')
>>> f.encoding
'cp1252'
>>> exit()

Note that even after both suggested commands, my opened file's encoding is still cp1252 instead of the intended utf-8.

walrus asked Jul 17 '15 06:07



3 Answers

As of Python 3.5.1, this hack looks like this:

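import _locale
# _getdefaultlocale is a private CPython hook; in Python 3.5 on Windows,
# the default encoding used by open() is derived from its return value.
_locale._getdefaultlocale = (lambda *args: ['en_US', 'utf8'])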

All files opened thereafter will assume the default encoding to be utf8.
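A minimal sketch of how the override above could be limited to Windows, so that Linux runs are left untouched. The function name force_utf8_default is hypothetical, and the whole approach relies on a private CPython detail, so it may have no effect on other Python versions:

```python
import sys


def force_utf8_default():
    """Apply the override on Windows only; return True if it was applied.

    force_utf8_default is a hypothetical helper name, not part of any library.
    """
    if sys.platform != "win32":
        # Other platforms are left untouched.
        return False
    import _locale  # private CPython module, not a documented API
    _locale._getdefaultlocale = lambda *args: ("en_US", "utf8")
    return True


# Call this once, as early as possible, before any files are opened.
applied = force_utf8_default()
print("override applied:", applied)
```

Checking f.encoding on a freshly opened file afterwards is the simplest way to see whether the override took effect on a given interpreter.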

axil answered Sep 22 '22 14:09



I know it's a really hacky workaround, but you could redefine the locale.getpreferredencoding() function like so:

import locale

def getpreferredencoding(do_setlocale=True):
    return "utf-8"

locale.getpreferredencoding = getpreferredencoding

If you run this early on, all files opened afterwards (at least in my testing on a Windows XP machine) open in UTF-8, and since this overrides the module-level function, it would apply on all platforms.
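If patching the locale module feels too fragile, a different technique (not the override above) is a thin wrapper around the built-in open() that only fills in the encoding when the caller did not pass one. This is a sketch under assumptions: open_utf8 is a hypothetical name, and each module would need to import it (or bind it to the name open) itself:

```python
import builtins
import os
import tempfile


def open_utf8(file, mode="r", buffering=-1, encoding=None, *args, **kwargs):
    """Like the built-in open(), but text mode defaults to UTF-8.

    open_utf8 is a hypothetical helper name used for illustration.
    """
    # Binary mode must not receive an encoding, so only text mode is changed.
    if "b" not in mode and encoding is None:
        encoding = "utf-8"
    return builtins.open(file, mode, buffering, encoding, *args, **kwargs)


# Usage: write text without naming an encoding, then inspect the raw bytes.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
with open_utf8(path, "w") as f:
    f.write("naïve café")
    used = f.encoding
with open_utf8(path, "rb") as f:
    raw = f.read()

print(used)
print(raw == "naïve café".encode("utf-8"))
```

An explicit encoding passed by the caller still wins, so existing calls that already name an encoding keep their behaviour.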

James Kent answered Sep 25 '22 14:09



The locale can be set to UTF-8 globally in Windows, if you so desire, as follows:

Control panel -> Clock and Region -> Region -> Administrative -> Change system locale -> Check Beta: Use Unicode UTF-8 ...

After this, and a reboot, I confirmed that locale.getpreferredencoding() returns 'cp65001' (=UTF-8) and that functions like open default to UTF-8.
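The confirmation described above can be repeated with a short script. This is only a sketch: the temporary file and the variable names are illustrative, and the printed values depend on the machine and Python version it runs on:

```python
import locale
import os
import tempfile

# What the locale module reports as the preferred encoding.
preferred = locale.getpreferredencoding(False)

# What open() actually picks when no encoding argument is given.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
with open(path, "w") as f:
    default_for_open = f.encoding

print("locale.getpreferredencoding():", preferred)
print("open() default encoding:", default_for_open)
```

Running it before and after changing the system locale shows whether the setting reached Python.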

JBSnorro answered Sep 23 '22 14:09
