Setting the default output encoding in Python 2 is a well-known idiom:
sys.stdout = codecs.getwriter("utf-8")(sys.stdout)
This wraps the sys.stdout object in a codec writer that encodes output in UTF-8.
However, this technique does not work in Python 3 because sys.stdout.write() expects a str, but the result of encoding is bytes, and an error occurs when codecs tries to write the encoded bytes to the original sys.stdout.
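The failure described above can be reproduced without touching the real sys.stdout. The following is a minimal sketch that assumes an in-memory io.StringIO as a stand-in for a text-mode stream; the variable names are illustrative only.

```python
import codecs
import io

# A text-mode stream standing in for Python 3's sys.stdout (assumption:
# io.StringIO rejects bytes the same way a text-mode stdout does).
text_stream = io.StringIO()

# The Python 2 idiom: wrap the text stream in a UTF-8 codec writer.
writer = codecs.getwriter("utf-8")(text_stream)

error = None
try:
    # The writer encodes the str to bytes, then passes the bytes to
    # text_stream.write(), which only accepts str.
    writer.write("caf\u00e9")
except TypeError as exc:
    error = exc

print(type(error).__name__)
```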
What is the correct way to do this in Python 3?
stdout: sys.stdout is a built-in file object that corresponds to the interpreter's standard output stream. By default it is used to display output on the console.
UTF-8 is a byte-oriented encoding. The encoding specifies that each character is represented by a specific sequence of one or more bytes.
String encoding: Since Python 3.0, strings are stored as Unicode, i.e. each character in the string is represented by a code point, so each string is a sequence of Unicode code points. For storage or transmission, the sequence of code points is converted into a sequence of bytes.
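As a small illustration of the difference between code points and encoded bytes, here is a sketch that uses an arbitrary sample string (chosen for this example, not taken from the question):

```python
# Sample text written with escapes so the code points are unambiguous:
# "naive cafe" with U+00EF and U+00E9, followed by U+2615.
text = "na\u00efve caf\u00e9 \u2615"

# Encoding turns the sequence of code points into a sequence of bytes.
encoded = text.encode("utf-8")

print(len(text))     # number of code points
print(len(encoded))  # number of bytes; non-ASCII characters take 2 or 3 bytes here

# Decoding with the same codec recovers the original str.
decoded = encoded.decode("utf-8")
```

Characters outside ASCII occupy more than one byte in UTF-8, which is why the byte count is larger than the code point count.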
Since Python 3.7 you can change the encoding of standard streams with reconfigure():
sys.stdout.reconfigure(encoding='utf-8')
You can also modify how encoding errors are handled by adding an errors parameter.
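A minimal sketch of reconfigure() with an errors argument follows. The helper name force_utf8 is hypothetical, and the demonstration uses an in-memory io.TextIOWrapper in place of sys.stdout so the resulting bytes can be inspected. The hasattr-style guard is an assumption for environments where sys.stdout has been replaced by an object without reconfigure().

```python
import io
import sys

def force_utf8(stream, errors="backslashreplace"):
    """Hypothetical helper: switch a text stream to UTF-8 if it supports it.

    Returns True if the stream was reconfigured, False otherwise.
    """
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        # Not an io.TextIOWrapper (or Python older than 3.7).
        return False
    reconfigure(encoding="utf-8", errors=errors)
    return True

# Demonstration on an in-memory stream that starts out as ASCII.
raw = io.BytesIO()
demo = io.TextIOWrapper(raw, encoding="ascii")
changed = force_utf8(demo)
demo.write("caf\u00e9")
demo.flush()
output = raw.getvalue()

# The same call applied to the real standard output stream.
force_utf8(sys.stdout)
```

The errors value controls what happens to characters that cannot be encoded; "backslashreplace" is just one of the standard handler names, chosen here for illustration.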
Python 3.1 added io.TextIOBase.detach(), with a note in the documentation for sys.stdout:
The standard streams are in text mode by default. To write or read binary data to these, use the underlying binary buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc'). Using io.TextIOBase.detach(), streams can be made binary by default. This function sets stdin and stdout to binary:
def make_streams_binary():
    sys.stdin = sys.stdin.detach()
    sys.stdout = sys.stdout.detach()
Therefore, the corresponding idiom for Python 3.1 and later is:
sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
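The idiom can be tried safely on an in-memory stream before applying it to the real sys.stdout. This is a sketch under that assumption; the helper name utf8_writer is hypothetical.

```python
import codecs
import io

def utf8_writer(text_stream):
    """Hypothetical helper: detach the binary buffer and wrap it in a UTF-8 writer."""
    # detach() flushes the text wrapper and returns its underlying binary
    # buffer; the original wrapper is unusable afterwards.
    binary = text_stream.detach()
    return codecs.getwriter("utf-8")(binary)

# In-memory stand-in for sys.stdout, initially configured as ASCII.
raw = io.BytesIO()
wrapped = io.TextIOWrapper(raw, encoding="ascii")

writer = utf8_writer(wrapped)
writer.write("caf\u00e9")  # str in, UTF-8 bytes out to the binary buffer
writer.flush()
result = raw.getvalue()

# Applied to the real stream, this would be:
# sys.stdout = utf8_writer(sys.stdout)
```

Because the codec writer now receives a binary stream, the encoded bytes are accepted instead of raising a TypeError.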