Setting the default output encoding in Python 2 is a well-known idiom:
sys.stdout = codecs.getwriter("utf-8")(sys.stdout)
This wraps the sys.stdout object in a codec writer that encodes output in UTF-8.
However, this technique does not work in Python 3 because sys.stdout.write() expects a str, but the result of encoding is bytes, and an error occurs when codecs tries to write the encoded bytes to the original sys.stdout.
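The failure can be reproduced without touching the real sys.stdout. The sketch below uses an in-memory text stream as a stand-in for it; the names text_stream, writer, failed, and message are illustrative and not part of the original question.

```python
import codecs
import io

# Stand-in for Python 3's sys.stdout: a text-mode stream that accepts only str.
text_stream = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")

# The Python 2 idiom, applied unchanged to a text-mode stream.
writer = codecs.getwriter("utf-8")(text_stream)

try:
    # The writer encodes "hello" to bytes, then passes the bytes to
    # text_stream.write(), which rejects them.
    writer.write("hello")
    failed = False
    message = ""
except TypeError as exc:
    failed = True
    message = str(exc)

print(failed)
```

The TypeError comes from the text stream itself, which is why the codec writer has to sit on top of a binary stream instead.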
What is the correct way to do this in Python 3?
stdout. sys.stdout is the built-in file object that corresponds to the interpreter's standard output stream. print() writes to it by default, and its output normally appears on the console.
UTF-8. A byte-oriented, variable-width encoding: each Unicode code point is represented by a specific sequence of one to four bytes.
String Encoding. Since Python 3.0, strings are Unicode: each string is a sequence of Unicode code points. To store or transmit a string, that sequence of code points is converted into a sequence of bytes, and this conversion is called encoding.
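As a small illustration of the difference between code points and bytes, the sketch below encodes a short string to UTF-8 and back. The sample text and variable names are chosen only for this example.

```python
# "héllo €" written with escapes so the source file stays pure ASCII.
text = "h\u00e9llo \u20ac"

# str -> bytes: each code point becomes one or more bytes.
encoded = text.encode("utf-8")

# bytes -> str: decoding with the same encoding restores the original string.
decoded = encoded.decode("utf-8")

# 7 code points, but 10 bytes: "é" takes 2 bytes and "€" takes 3.
print(len(text), len(encoded))
```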
Since Python 3.7 you can change the encoding of standard streams with reconfigure():
sys.stdout.reconfigure(encoding='utf-8')
You can also modify how encoding errors are handled by adding an errors parameter.
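To see the effect of both parameters without changing the real standard output, here is a hedged sketch that calls reconfigure() on an in-memory io.TextIOWrapper, the same class sys.stdout normally is. The raw and stream names are placeholders, and Python 3.7 or later is assumed.

```python
import io

# Stand-in for sys.stdout: a text wrapper over an in-memory byte buffer.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="ascii", errors="strict")

# Same call shape as sys.stdout.reconfigure(encoding='utf-8').
stream.reconfigure(encoding="utf-8")
stream.write("caf\u00e9 ")

# Switch to ASCII and escape characters that cannot be encoded
# instead of raising UnicodeEncodeError.
stream.reconfigure(encoding="ascii", errors="backslashreplace")
stream.write("caf\u00e9")
stream.flush()

output = raw.getvalue()
print(output)
```

The first write produces the two-byte UTF-8 form of "é"; the second produces the literal escape \xe9, because "backslashreplace" substitutes an escape sequence for anything ASCII cannot represent.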
Python 3.1 added io.TextIOBase.detach(), with a note in the documentation for sys.stdout:
The standard streams are in text mode by default. To write or read binary data to these, use the underlying binary buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc'). Using io.TextIOBase.detach() streams can be made binary by default. This function sets stdin and stdout to binary:
def make_streams_binary():
    sys.stdin = sys.stdin.detach()
    sys.stdout = sys.stdout.detach()
Therefore, the corresponding idiom for Python 3.1 and later is:
sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
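The same idiom can be tried on an in-memory stream before applying it to sys.stdout. This is only a sketch: text_stream stands in for sys.stdout, and the variable names are made up for the example.

```python
import codecs
import io

# Stand-in for sys.stdout: a text wrapper over a byte buffer.
raw = io.BytesIO()
text_stream = io.TextIOWrapper(raw, encoding="ascii")

# detach() flushes the wrapper and returns the underlying binary stream,
# which accepts the bytes produced by the codec writer.
writer = codecs.getwriter("utf-8")(text_stream.detach())
writer.write("na\u00efve")

result = raw.getvalue()
print(result)
```

After detach(), the original text wrapper is unusable, so all further output has to go through the new writer.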