UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1

I'm having a few issues trying to encode a string to UTF-8. I've tried numerous things, including using string.encode('utf-8') and unicode(string), but I get the error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1: ordinal not in range(128)

This is my string:

(｡･ω･｡)ﾉ

I don't see what's going wrong, any idea?

Edit: The problem is that printing the string as it is does not display it properly. I also get this error when I try to convert it:

Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = '(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'
>>> s1 = s.decode('utf-8')
>>> print s1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-5: ordinal not in range(128)

424

asked May 12 '12 07:05

Markum

2 Answers

This happens because your terminal's encoding is not set to UTF-8. Here is my terminal:

$ echo $LANG
en_GB.UTF-8
$ python
Python 2.7.3 (default, Apr 20 2012, 22:39:59)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = '(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'
>>> s1 = s.decode('utf-8')
>>> print s1
(｡･ω･｡)ﾉ
>>>

With that setting the example works on my terminal, but if I unset LANG it fails:

$ unset LANG
$ python
Python 2.7.3 (default, Apr 20 2012, 22:39:59)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = '(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'
>>> s1 = s.decode('utf-8')
>>> print s1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-5: ordinal not in range(128)
>>>

Consult the docs for your Linux distribution to find out how to set LANG to a UTF-8 locale permanently.
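To check what Python thinks the output encoding is before touching any locale settings, a minimal sketch along these lines may help. It uses only the standard library and is written to run on both Python 2.7 and Python 3. The helper name can_encode is hypothetical (it is not part of any library), the byte string is the one from the question, and the fallback to 'ascii' when no encoding is reported is an assumption that mirrors Python 2's default.

```python
import os
import sys

# The UTF-8 bytes from the question, as a bytes literal so the snippet
# behaves the same on Python 2.7 and Python 3.
raw = b'(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'
text = raw.decode('utf-8')


def can_encode(value, encoding):
    """Hypothetical helper: True if `value` encodes cleanly with `encoding`."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# sys.stdout.encoding may be None (for example when output is piped on
# Python 2); falling back to 'ascii' here is an assumption, not a rule.
stdout_encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'

print('LANG: %s' % os.environ.get('LANG'))
print('stdout encoding: %s' % stdout_encoding)
print('encodable as ascii: %s' % can_encode(text, 'ascii'))
print('encodable as utf-8: %s' % can_encode(text, 'utf-8'))
```

If the reported stdout encoding is an ASCII variant, printing the decoded string will most likely fail in the same way as the session above, because the characters cannot be represented in ASCII.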

180

answered Oct 03 '22 04:10

Nick Craig-Wood

Try:

string.decode('utf-8')  # or: unicode(string, 'utf-8')

Edit:

'(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'.decode('utf-8') gives u'(\uff61\uff65\u03c9\uff65\uff61)\uff89', which is correct.

So your problem must be somewhere else, possibly where you do something with the string that triggers an implicit conversion (this could be printing, writing to a stream...).

To say more, we'll need to see some code.
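As an illustration of such an implicit conversion: in Python 2, calling .encode('utf-8') or unicode() on a byte string first decodes it with the default 'ascii' codec, which is where the UnicodeDecodeError in the question title comes from. The sketch below makes that hidden step explicit so it runs on both Python 2.7 and Python 3. The variable names are only for illustration.

```python
# The UTF-8 bytes from the question.
raw = b'(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'

# Python 2 does roughly this behind the scenes when you call
# raw.encode('utf-8') or unicode(raw): it decodes with 'ascii' first.
error = None
try:
    raw.decode('ascii')
except UnicodeDecodeError as exc:
    error = exc

print(error)

# Decoding with the codec the bytes were actually written in works,
# and encoding the result again gives back the original bytes.
decoded = raw.decode('utf-8')
round_trip = decoded.encode('utf-8')
print(round_trip == raw)
```

The failure is reported at position 1 because byte 0 is the ASCII '(' and byte 1 (0xef) is the first byte outside the ASCII range.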

answered Oct 03 '22 05:10

mata

Tags: python, unicode, utf-8
