
Python UnicodeEncodeError: 'ascii' codec can't encode character in position 0: ordinal not in range(128) [duplicate]

In Python 2.7, I see the following error when trying to cast a value to a string so that it matches the output schema:

UnicodeEncodeError: 'ascii' codec can't encode character in position 0: ordinal not in range(128)

I tried to find out why and reproduced the error in Jupyter simply by typing:

str(u'\u2013')

How can I cast a value to a string in a way that handles this kind of error? Thanks!

Bin asked Jan 19 '18 18:01


People also ask

What is ordinal not in range 128?

If you see the "ordinal not in range(128)" error, it is because str() is implicitly encoding a unicode object to bytes with the ASCII codec. To solve the problem, stop calling str() on the value and call .encode() with an explicit encoding instead. Syntax (Python 3 signature; in Python 2 the same method is called on unicode objects): str.encode(encoding="utf-8", errors="strict")

How do I fix unicode encode errors in Python?

An encoding such as ASCII can represent only a limited set of Unicode characters. Any character that the target encoding cannot represent causes encoding to fail with UnicodeEncodeError. To avoid this error, call encode('utf-8') and decode('utf-8') explicitly wherever text is converted to or from bytes.

What is UnicodeEncodeError?

UnicodeEncodeError normally occurs when encoding a unicode string with a particular codec. Since a codec maps only a limited number of Unicode characters to byte strings (str in Python 2), a character that the codec cannot represent causes its encode() to fail.
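The failure described above can be reproduced without str() by calling encode('ascii') directly, which is roughly what Python 2 does implicitly when str() receives a unicode object. The following is a minimal sketch; the variable names are illustrative only, and it is written so that it should behave the same on Python 2.7 and Python 3.

```python
text = u'\u2013'  # EN DASH, code point 8211, outside ASCII's 0-127 range

err = None
try:
    text.encode('ascii')  # roughly what Python 2's str(text) attempts implicitly
except UnicodeEncodeError as exc:
    err = exc  # keep a reference; Python 3 clears `exc` after the except block

if err is not None:
    # The exception records which codec failed, where, and why.
    print("%s codec failed at positions %d-%d: %s"
          % (err.encoding, err.start, err.end, err.reason))
```

The start and end attributes point at the offending slice of the input, which helps locate the bad character in longer strings.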


2 Answers

Try this:

u'\u2013'.encode('utf-8')
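As a small sketch of why this works: UTF-8 can represent every Unicode character, so the encode step cannot fail the way the ASCII codec does, and decoding returns the original text. The variable names below are illustrative. Note that the encoded value is a byte string (str in Python 2, bytes in Python 3), not text.

```python
text = u'\u2013'

encoded = text.encode('utf-8')     # byte string; EN DASH takes three bytes in UTF-8
decoded = encoded.decode('utf-8')  # back to the original unicode text

print(repr(encoded))
print(decoded == text)
```

Unlike dropping characters, this conversion is lossless, provided whatever consumes the bytes also treats them as UTF-8.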
akhilsp answered Oct 16 '22 18:10


I will answer my own question. I found a duplicate question: stackoverflow.com/questions/9942594/

But for simplicity, here is an elegant solution that works well for my use case:

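def safe_str(obj):
    # In Python 2, str() on a unicode object encodes it with the ASCII codec.
    try:
        return str(obj)
    except UnicodeEncodeError:
        # Drop every non-ASCII character, then decode the ASCII bytes back to text.
        # Note: in Python 2 this returns a unicode object, not a str.
        return obj.encode('ascii', 'ignore').decode('ascii')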

safe_str(u'\u2013')

Or simply use:

u'\u2013'.encode('ascii', 'ignore')
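Keep in mind that 'ignore' silently discards the character. The sketch below compares it with the other standard error handlers accepted by encode(); the sample string and the names handlers and results are made up for illustration.

```python
text = u'a\u2013b'  # 'a', EN DASH, 'b'

handlers = ['ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace']

# Encode the same text to ASCII once per error handler.
results = {name: text.encode('ascii', name) for name in handlers}

for name in handlers:
    print("%-18s %r" % (name, results[name]))
```

'ignore' drops the dash, 'replace' substitutes a question mark, and the last two keep the information as an XML character reference or a backslash escape, so they may be preferable when the original character matters.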
Bin answered Oct 16 '22 19:10