In Python 2.7, I see the following error when casting a value to a string to make sure it matches the output schema:
UnicodeEncodeError: 'ascii' codec can't encode character in position 0: ordinal not in range(128)
I tried to find out why and reproduced the error in Jupyter simply by typing:
str(u'\u2013')
How can I cast to a string in a way that handles this kind of error? Thanks!
The "ordinal not in range(128)" error occurs because calling str() on a unicode object implicitly encodes it to bytes with the default ASCII codec. To fix it, stop calling str() and call .encode() with an explicit encoding instead. Syntax: text.encode(encoding="utf-8", errors="strict"), where text is the unicode string.
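As a minimal sketch of the difference, the snippet below encodes the same character twice: once with the ASCII codec (assuming the default encoding is 'ascii', which is what str() falls back to in Python 2.7) and once with UTF-8. The variable names are only for illustration.

```python
text = u'\u2013'  # EN DASH, outside the ASCII range

# Encoding with the ASCII codec fails, just like str(text) does in Python 2.7.
try:
    text.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly with UTF-8 succeeds and yields three bytes: E2 80 93.
utf8_bytes = text.encode('utf-8')

print(ascii_failed)
print(len(utf8_bytes))
```

Decoding utf8_bytes with 'utf-8' gives back the original unicode string, so nothing is lost in the round trip.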
A codec such as ASCII maps only a limited set of Unicode characters to bytes. Any character without a mapping causes encoding to fail and raises UnicodeEncodeError. To avoid this error, call encode('utf-8') and decode('utf-8') explicitly wherever your code converts between unicode and byte strings.
The UnicodeEncodeError normally happens when encoding a unicode string with a particular codec. Since a codec maps only a limited number of Unicode characters to str byte strings, a character it cannot represent causes that codec's encode() to fail. This is the case whenever you encode from unicode to str.
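To see that the failure depends on the codec rather than on the string, here is a small sketch. The helper can_encode is a hypothetical name made up for this example, not a standard library function.

```python
def can_encode(text, codec):
    """Return True if every character in text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

e_acute = u'\xe9'    # LATIN SMALL LETTER E WITH ACUTE
en_dash = u'\u2013'  # EN DASH

for codec in ('ascii', 'latin-1', 'utf-8'):
    print(codec + ' ' + str(can_encode(e_acute, codec)) + ' ' + str(can_encode(en_dash, codec)))
```

ASCII rejects both characters, Latin-1 accepts only the accented letter, and UTF-8 accepts both, which is why UTF-8 is usually the safe choice when you control the output encoding.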
Try this:
u'\u2013'.encode('utf-8')
I will answer my own question. I found a duplicate question: stackoverflow.com/questions/9942594/
But for simplicity, here is an elegant solution that works well with my use case:
def safe_str(obj):
    try:
        return str(obj)
    except UnicodeEncodeError:
        # obj is a unicode string with non-ASCII characters: drop them
        return obj.encode('ascii', 'ignore').decode('ascii')
safe_str(u'\u2013')
Or simply use:
u'\u2013'.encode('ascii', 'ignore')
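Note that 'ignore' silently drops every character ASCII cannot represent. If that loses too much, the other standard error handlers may fit better. A rough comparison, assuming you still need ASCII-only output (the sample string and variable names are illustrative):

```python
text = u'a\u2013b'  # 'a', EN DASH, 'b'

# Drop the unencodable character entirely.
ignored = text.encode('ascii', 'ignore')

# Substitute a question mark for it.
replaced = text.encode('ascii', 'replace')

# Substitute an XML/HTML numeric character reference.
xml_refs = text.encode('ascii', 'xmlcharrefreplace')

# Substitute a Python-style backslash escape.
escaped = text.encode('ascii', 'backslashreplace')

for result in (ignored, replaced, xml_refs, escaped):
    print(repr(result))
```

All four results are plain ASCII byte strings. Only the last two keep enough information to recover the original character later.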