>>> import string
>>> import locale
>>> string.letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> locale.getpreferredencoding()
'UTF-8'
>>> string.letters
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
Any workarounds for this?
Platform: Linux
Python 2.6.7 and Python 2.7.3 seem to be affected. It works fine in Python 3 (with ascii_letters).
By default, Python tries to honor the Unix locale system, including the LC_ALL, LC_CTYPE, and LANG environment variables. In theory, standards are good, but in my experience these variables only cause problems.
def decode_as_string(text, encoding=None):
    """
    Decode the console or file output explicitly using getpreferredencoding.
    The text parameter should be an encoded string; if not, no decode occurs.
    If no encoding is given, getpreferredencoding is used.
    If encoding is specified, that is used instead.
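The excerpt above is cut off before its body. A minimal sketch of what such a helper could look like, assuming it returns already-decoded input unchanged and falls back to `locale.getpreferredencoding(False)` when no encoding is given. The body, the `errors="replace"` choice, and the `False` argument are assumptions for illustration, not the original project's code:

```python
import locale


def decode_as_string(text, encoding=None):
    """Decode bytes to str, defaulting to the locale's preferred encoding.

    Hypothetical body: the original excerpt is truncated, so this is
    only one plausible reading of its docstring.
    """
    if not isinstance(text, bytes):
        # Not an encoded string: no decode occurs.
        return text
    if encoding is None:
        # Passing False skips the temporary setlocale() call.
        encoding = locale.getpreferredencoding(False)
    # Assumption: replace undecodable bytes rather than raise.
    return text.decode(encoding, errors="replace")
```

Calling `decode_as_string(b"abc")` returns the text `abc`, while a value that is already a `str` is passed through untouched.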
Note: what the OP did to solve the issue was to pass encoding='UTF-8' to the open call. If you run into this issue and are just looking for a fix, this works. The rest of the post explains why.
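As a rough illustration of that fix, the sketch below writes and reads a temporary file with an explicit encoding, so the result does not depend on the locale environment variables. It assumes Python 3, where the built-in `open` accepts `encoding`; the sample text and the temporary file are made up for the example:

```python
import os
import tempfile

text = "naïve café"

# Create a scratch file; the name is chosen by tempfile, not by us.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # An explicit encoding overrides whatever the locale would suggest.
    with open(path, "w", encoding="UTF-8") as f:
        f.write(text)
    with open(path, "r", encoding="UTF-8") as f:
        round_tripped = f.read()
finally:
    os.remove(path)
```

Here `round_tripped` equals `text` regardless of how LC_ALL, LC_CTYPE, or LANG are set.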
As Lukas said, the docs specify:
On some systems, it is necessary to invoke setlocale() to obtain the user preferences
Initially, string.letters is set to lowercase + uppercase:
lowercase = 'abcdefghijklmnopqrstuvwxyz'
uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
letters = lowercase + uppercase
However, when you call getpreferredencoding(), the _locale module overrides it by calling PyDict_SetItemString(string, "letters", ulo); after generating the string inside fixup_ulcase(void) with the following:
/* create letters string */
n = 0;
for (c = 0; c < 256; c++) {
    if (isalpha(c))
        ul[n++] = c;
}
ulo = PyString_FromStringAndSize((const char *)ul, n);
if (!ulo)
    return;
if (string)
    PyDict_SetItemString(string, "letters", ulo);
Py_DECREF(ulo);
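To see why the order flips, the loop can be mirrored in Python. This is only a sketch: it uses `bytes.isalpha()`, which is ASCII-only, as a stand-in for C's locale-aware `isalpha()`, so it shows the ordering effect but not any extra 8-bit letters a particular locale might add.

```python
# Rebuild "letters" the way the C loop does: walk the byte values
# 0..255 in ascending order and keep the alphabetic ones.
# Assumption: bytes.isalpha() (ASCII-only) stands in for C's isalpha().
rebuilt = "".join(chr(c) for c in range(256) if bytes([c]).isalpha())

lowercase = "abcdefghijklmnopqrstuvwxyz"
uppercase = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
initial = lowercase + uppercase  # what the string module starts with

print(rebuilt)
```

Because 'A' (65) sorts before 'a' (97), the rebuilt string is uppercase followed by lowercase, which matches the second string.letters output in the question.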
In turn, this is called in PyLocale_setlocale, which is indeed setlocale, which is called by getpreferredencoding (code here: http://hg.python.org/cpython/file/07a6fca7ff42/Lib/locale.py#l612):
def getpreferredencoding(do_setlocale = True):
    """Return the charset that the user is likely using,
    according to the system configuration."""
    if do_setlocale:
        oldloc = setlocale(LC_CTYPE)
        try:
            setlocale(LC_CTYPE, "")
        except Error:
            pass
        result = nl_langinfo(CODESET)
        setlocale(LC_CTYPE, oldloc)
        return result
    else:
        return nl_langinfo(CODESET)
Try getpreferredencoding(False), which skips the setlocale call and so leaves string.letters untouched.
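A minimal sketch of the two workarounds mentioned in this thread, combined: query the encoding without the setlocale round trip, and rely on `string.ascii_letters`, which is documented as not locale-dependent. The variable names are illustrative only.

```python
import locale
import string

# do_setlocale=False skips the setlocale(LC_CTYPE, "") round trip.
encoding = locale.getpreferredencoding(False)

# ascii_letters is a fixed constant: lowercase first, then uppercase.
letters = string.ascii_letters

print(encoding)
print(letters)
```

Note that without the setlocale call, the reported encoding may not reflect the user's preferences on some systems, as the documentation quoted above warns.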
Windows uses different code for getting the locale, as you can see here.
In Python 3, getdefaultlocale does not accept a boolean setlocale variable and does not call setlocale itself, as you can see here.