In Python 3, bytes requires an encoding when it is given a string:
bytes(s, encoding="utf-8")
Is there a way to set a default encoding, so that bytes always encodes in UTF-8?
The simplest way I imagine is
def bytes_utf8(s):
    # Always encode the string s as UTF-8
    return bytes(s, encoding="utf-8")
In Python 3, strings (str) are sequences of Unicode code points. We only need to convert to and from bytes when writing or reading data. The default encoding for this conversion (str.encode() and bytes.decode()) is UTF-8, but other encodings can also be used.
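The round trip described above can be sketched as follows. The sample string and variable names are made up for illustration; only str.encode() and bytes.decode() come from the standard library.

```python
# Hypothetical sample text containing two non-ASCII characters.
text = "naïve café"

# With no argument, encode() and decode() both use UTF-8.
data = text.encode()
round_tripped = data.decode()

# Another encoding can be named explicitly; Latin-1 uses one byte per character here.
latin = text.encode("latin-1")

print(len(text), len(data), len(latin))
```

The UTF-8 result is longer than the Latin-1 result because each accented character takes two bytes in UTF-8.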
sys.getdefaultencoding() returns the name of the current default string encoding used by the Unicode implementation.
In Eclipse, open the run dialog settings ("Run Configurations", if I remember correctly); you can choose the default encoding on the Common tab. Change it to US-ASCII if you want to hit these errors early (in other words, in your PyDev environment).
The documentation for bytes redirects you to the documentation for bytearray, which says in part:
The optional source parameter can be used to initialize the array in a few different ways:
- If it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode().
It looks like there's no way to provide a default.
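A quick way to confirm this is to call bytes on a string without an encoding and catch the resulting TypeError. This is only a sketch; the helper name try_bytes_without_encoding is hypothetical and not part of any library.

```python
def try_bytes_without_encoding(s):
    """Return the error message raised by bytes(s), or None if the call succeeds."""
    try:
        bytes(s)
    except TypeError as exc:
        return str(exc)
    return None

# For a str argument, bytes() refuses to guess an encoding.
message = try_bytes_without_encoding("hello")
print(message)

# Supplying the encoding makes the same call succeed.
encoded = bytes("hello", encoding="utf-8")
print(encoded)
```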
You can use the encode method, which does have a default: UTF-8, the same value that sys.getdefaultencoding() reports in Python 3. If you need to change the default, check out this question but be aware that the capability to do it easily was removed for good reason.
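import sys
print(sys.getdefaultencoding())  # the default string encoding
s = "some text"  # example string
s.encode()  # no encoding argument needed; the default is used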
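To check that the two spellings agree, the sketch below compares bytes with an explicit encoding against encode() with no arguments. It also shows functools.partial as one possible way to build a UTF-8-only constructor; the name bytes_utf8 and the sample string are assumptions for illustration.

```python
import functools
import sys

s = "héllo"  # hypothetical sample string

explicit = bytes(s, encoding="utf-8")  # encoding spelled out
implicit = s.encode()                  # relies on the default

# A partial that pre-fills the encoding keyword, as an alternative to a def.
bytes_utf8 = functools.partial(bytes, encoding="utf-8")

print(explicit == implicit == bytes_utf8(s))
print(sys.getdefaultencoding())
```

Note that the partial only works for string arguments, since bytes rejects an encoding when the source is not a string.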