What is unicode_literals used for?

Tags: python, encoding, unicode, utf-8

I get a weird problem with __future__.unicode_literals in Python. Without importing unicode_literals I get the correct output:

    # encoding: utf-8
    # from __future__ import unicode_literals
    name = 'helló wörld from example'
    print name

But when I add the unicode_literals import:

    # encoding: utf-8
    from __future__ import unicode_literals
    name = 'helló wörld from example'
    print name

I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 4: ordinal not in range(128)

Does unicode_literals encode every string as UTF-8? What should I do to resolve this error?

asked Apr 29 '14 16:04

ssj

1 Answer

Your terminal or console is failing to let Python know it supports UTF-8.

Without the from __future__ import unicode_literals line, you are building a byte string that holds UTF-8 encoded bytes. With the import, you are building a unicode string.

print has to treat these two values differently; a byte string is written to sys.stdout unchanged. A unicode string is encoded to bytes first, and Python consults sys.stdout.encoding for that. If your system doesn't correctly tell Python what codec it supports, the default is to use ASCII.
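The encoding step described above can be sketched without a terminal at all. This is only an illustration of the two codecs involved, not what print does internally line for line; the variable names are mine, and the non-ASCII characters are written as escapes so the snippet does not depend on the source file encoding.

```python
# Sketch: encode the same text with the two codecs discussed above.
# u'\xf3' is "ó" and u'\xf6' is "ö", as in the question's string.
name = u'hell\xf3 w\xf6rld from example'

# UTF-8 can represent every character, so this succeeds.
utf8_bytes = name.encode('utf-8')

# ASCII cannot represent "ó", so this raises the error from the question.
try:
    name.encode('ascii')
except UnicodeEncodeError as exc:
    ascii_error = exc
else:
    ascii_error = None

print(repr(utf8_bytes))
print(ascii_error)
```

The exception reports position 4, which is the index of the "ó" in the string, matching the traceback in the question.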

Your system failed to tell Python what codec to use; sys.stdout.encoding is set to ASCII, and encoding the unicode value to print failed.

You can verify this by manually encoding to UTF-8 when printing:

    # encoding: utf-8
    from __future__ import unicode_literals
    name = 'helló wörld from example'
    print name.encode('utf8')

and you can reproduce the issue by creating unicode literals without the from __future__ import statement too:

    # encoding: utf-8
    name = u'helló wörld from example'
    print name

where u'..' is a unicode literal too.
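To see what Python actually detected, you can inspect the values it consults. The helper below is a hypothetical name of my own, not part of any library; it only reads sys.stdout.encoding and the locale's preferred encoding. Note that in Python 2, sys.stdout.encoding is typically None when output is piped or redirected, in which case print falls back to ASCII.

```python
import locale
import sys


def describe_output_encoding():
    """Collect the values relevant to how printed text gets encoded.

    Hypothetical helper for diagnosis only.
    """
    return {
        # None or an ASCII codec name here would explain the error.
        'stdout_encoding': getattr(sys.stdout, 'encoding', None),
        # What the locale settings say the user prefers.
        'preferred_encoding': locale.getpreferredencoding(),
        # Whether stdout is attached to a terminal.
        'stdout_is_tty': hasattr(sys.stdout, 'isatty') and sys.stdout.isatty(),
    }


info = describe_output_encoding()
for key in sorted(info):
    print('%s: %r' % (key, info[key]))
```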

Without details on what your environment is, it is hard to say what the solution is; this depends very much on the OS and console or terminal used.
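If it turns out that the terminal does handle UTF-8 and only the detection is failing, one possible workaround is to choose the codec explicitly. The sketch below assumes UTF-8 is the right choice for your terminal, which I can't confirm from the question; utf8_writer is a hypothetical helper name, and the demonstration writes to an in-memory byte stream rather than the real stdout.

```python
import codecs
import io


def utf8_writer(binary_stream):
    """Wrap a byte stream so that unicode text written to it is UTF-8 encoded.

    Hypothetical helper; assumes the consumer of the stream expects UTF-8.
    """
    return codecs.getwriter('utf-8')(binary_stream)


# Demonstration against an in-memory byte stream instead of a terminal.
demo = io.BytesIO()
out = utf8_writer(demo)
out.write(u'hell\xf3 w\xf6rld from example\n')
written = demo.getvalue()
print(repr(written))
```

In Python 2 the same wrapper could be applied as sys.stdout = utf8_writer(sys.stdout). Alternatively, setting the PYTHONIOENCODING environment variable to utf-8 before starting Python overrides the detected codec without any code changes. Either way, fixing the locale or terminal configuration is the cleaner solution if that is where the problem lies.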

answered Sep 22 '22 06:09

Martijn Pieters
