Some of my application's libraries depend on being able to print UTF-8 characters to stdout and stderr. Therefore this must not fail:
print('\u2122')
On my local machine it works, but on my remote server it raises UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in position 0: ordinal not in range(128)
I tried $ PYTHONIOENCODING=utf8
with no apparent effect.
sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
works for a while, then stalls and finally fails with ValueError: underlying buffer has been detached
sys.getdefaultencoding() returns 'utf-8', and sys.stdout.encoding returns 'ANSI_X3.4-1968'.
What can I do? I don't want to edit third-party libraries.
An encoding such as ASCII can represent only a limited subset of Unicode characters. Any character that the target encoding cannot represent causes encoding to fail with UnicodeEncodeError. In code you control, you can avoid this by encoding and decoding explicitly with str.encode('utf-8') and bytes.decode('utf-8').
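As a minimal sketch of that behaviour (the variable names are only illustrative), the snippet below encodes the trademark sign from the question with the ASCII codec, which fails, and then with UTF-8, which round-trips. It also shows the standard errors= handlers, which substitute a placeholder instead of raising.

```python
text = "\u2122"  # TRADE MARK SIGN, outside the 0-127 ASCII range

# ASCII has no mapping for this character, so encoding raises.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# UTF-8 can represent every Unicode code point, so this round-trips.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# Error handlers trade fidelity for not raising.
replaced = text.encode("ascii", errors="replace")
escaped = text.encode("ascii", errors="backslashreplace")

print(ascii_failed, utf8_bytes, replaced, escaped)
```

This only helps for strings your own code encodes. It does not change what print() inside a third-party library does, which depends on sys.stdout.encoding.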
UTF-8 is one of the most commonly used encodings, and Python often defaults to using it. UTF stands for “Unicode Transformation Format”, and the '8' means that 8-bit values are used in the encoding.
UnicodeEncodeError normally happens when a Unicode string is encoded into a particular encoding. Since most encodings map only a limited number of Unicode characters to bytes, a character that the encoding cannot represent causes that encoding's encode() to fail.
In Python, the built-in functions chr() and ord() are used to convert between Unicode code points and characters. A character can also be represented by writing a hexadecimal Unicode code point with \x, \u, or \U in a string literal.
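A short illustration of those conversions, using the same character as the question. The variable names are just for the example.

```python
# ord() maps a one-character string to its code point; chr() is the inverse.
code_point = ord("\u2122")
char = chr(0x2122)

# The three escape forms differ only in how many hex digits they take:
# \xhh (2 digits), \uhhhh (4 digits), \Uhhhhhhhh (8 digits).
letter_a = "\x41"
tm_short = "\u2122"
tm_long = "\U00002122"

print(code_point, hex(code_point), char == tm_short == tm_long, letter_a)
```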
From @ShadowRanger's comment on my question:
PYTHONIOENCODING=utf8 won't work unless you export it (or prefix the Python launch with it). Otherwise, it's a local variable in bash that isn't inherited in the environment of child processes. export PYTHONIOENCODING=utf-8 would both set and export it in bash.
export PYTHONIOENCODING=utf-8
did the trick: UTF-8 characters no longer raise UnicodeEncodeError.
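To check the effect without touching the shell, here is a rough sketch that starts a child interpreter with PYTHONIOENCODING placed in its environment, which is what export arranges for processes launched from bash. The helper name run_child is hypothetical, and capture_output assumes Python 3.7 or later.

```python
import codecs
import os
import subprocess
import sys


def run_child(code, io_encoding=None):
    """Run `code` in a fresh interpreter and return its raw stdout bytes.

    If io_encoding is given, PYTHONIOENCODING is put into the child's
    environment, mirroring what `export` does in the shell.
    """
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)  # start from a known state
    if io_encoding is not None:
        env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        check=True,
    )
    return result.stdout


# Ask the child which encoding its stdout uses once the variable is inherited.
reported = run_child("import sys; print(sys.stdout.encoding)", "utf-8")
encoding_name = codecs.lookup(reported.decode("ascii").strip()).name

# The print() from the question now succeeds and emits UTF-8 bytes.
trademark_bytes = run_child("print('\\u2122')", "utf-8").strip()

print(encoding_name, trademark_bytes)
```

Calling run_child without io_encoding shows what the interpreter picks on its own. That result depends on the locale and the Python version, so it is not assumed here.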