I am writing some code that needs to work with both Py2.7 and Py3.7+.
I need to write text to a file using UTF-8 encoding. My code looks like this:
import six
...
content = ...
if isinstance(content, six.string_types):
    content = content.encode(encoding='utf-8', errors='strict')
# write 'content' to file
Above, is it possible for content.encode() to raise UnicodeError on either Py2.7 or Py3.7+? I cannot think of a scenario where this is possible. I am not a Python expert, so I suspect there must be an edge case I am missing.
Here is my reasoning why I think it will never raise UnicodeError:
six.string_types covers three types: Py2.7 str & unicode, Py3.7+ str

Yes, it's possible:
import six
content = ''.join(map(chr, range(0x110000)))
if isinstance(content, six.string_types):
    content = content.encode(encoding='utf-8', errors='strict')
Result (Try it online!, using Python 3.7.4):
Traceback (most recent call last):
  File ".code.tio", line 5, in <module>
    content = content.encode(encoding='utf-8', errors='strict')
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 55296-57343: surrogates not allowed
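The string holds every code point, including the surrogates U+D800 to U+DFFF (positions 55296-57343), which strict UTF-8 encoding rejects on Python 3. And UnicodeEncodeError is a subclass of UnicodeError.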
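A smaller reproduction may make the edge case easier to see. The sketch below assumes Python 3 and uses a single lone surrogate instead of the full code point range; the variable names (text, raised, is_encode_error, is_subclass) are only illustrative and not from the question.

```python
# Minimal sketch, assuming Python 3: one lone surrogate is enough
# to make strict UTF-8 encoding fail.
text = '\ud800'

try:
    text.encode('utf-8', 'strict')
    raised = None
except UnicodeError as exc:
    # Catching the base class also catches UnicodeEncodeError.
    raised = exc

# The exception actually raised is the more specific UnicodeEncodeError...
is_encode_error = isinstance(raised, UnicodeEncodeError)
# ...which derives from UnicodeError.
is_subclass = issubclass(UnicodeEncodeError, UnicodeError)

print(type(raised).__name__, is_encode_error, is_subclass)
```

On Python 2.7 a different path is worth checking as well: calling .encode() on a byte str that contains non-ASCII bytes first decodes it implicitly as ASCII, which raises UnicodeDecodeError, also a subclass of UnicodeError.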