Python: what's the difference between str(u'a') and u'a'.encode('utf-8')?

Tags: python, unicode

As the title says: is there a reason not to use str() to convert a unicode string to str?

>>> str(u'a')
'a'
>>> str(u'a').__class__
<type 'str'>
>>> u'a'.encode('utf-8')
'a'
>>> u'a'.encode('utf-8').__class__
<type 'str'>
>>> u'a'.encode().__class__
<type 'str'>

UPDATE: Thanks for the answer. I also didn't know that if I create a str using a special character, it ends up as UTF-8 encoded bytes:

>>> a = '€'
>>> a.__class__
<type 'str'>
>>> a
'\xe2\x82\xac'

Also, the same literal is a Unicode object in Python 3.
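
For comparison, here is a small sketch of the Python 3 behaviour mentioned above. It assumes Python 3.3 or later, and it writes the euro sign as an escape so the result does not depend on the terminal or source encoding. The variable names are only illustrative.

```python
# Sketch, assuming Python 3: str holds Unicode text, bytes holds the encoded form.
text = '\u20ac'                 # euro sign; a str (Unicode) in Python 3
data = text.encode('utf-8')     # explicit encoding produces a bytes object

print(type(text).__name__)      # str
print(type(data).__name__)      # bytes
print(data)                     # b'\xe2\x82\xac'
print(data.decode('utf-8') == text)  # True
```

The bytes printed here match the '\xe2\x82\xac' shown in the Python 2 session above; the difference is that Python 3 keeps text and bytes as separate types.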


asked Aug 27 '12 21:08

James Lin

1 Answer

When you write str(u'a'), it converts the Unicode string to a bytestring using the default encoding, which (unless you've gone to the trouble of changing it) is ASCII.

The second version explicitly encodes the string as UTF-8.
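
As a rough sketch of the two paths, the snippet below spells out the implicit step by hand. It assumes the default encoding is ASCII (the Python 2 default) and hard-codes it so the example also runs under Python 3. `implicit_str` and `explicit_utf8` are hypothetical helper names, not built-ins.

```python
# Sketch: what str(u) amounts to in Python 2 versus an explicit UTF-8 encode.
# ASSUMPTION: the default encoding is ASCII (the Python 2 default).

def implicit_str(u):
    """Hypothetical stand-in for Python 2's str(u): encode with ASCII."""
    return u.encode('ascii')

def explicit_utf8(u):
    """The second version from the question: encode explicitly as UTF-8."""
    return u.encode('utf-8')

# For pure-ASCII text the two results are byte-for-byte identical,
# because UTF-8 encodes code points 0..127 exactly as ASCII does.
same = implicit_str(u'a') == explicit_utf8(u'a')
print(same)  # True
```

So for input like u'a' the choice makes no visible difference, which is why the two calls in the question look interchangeable.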

The difference is more apparent if you try with a string containing non-ASCII characters. The second version will still work:

>>> u'€'.encode('utf-8')
'\xe2\x82\xac'

The first version will give an exception:

>>> str(u'€')

Traceback (most recent call last):
  File "", line 1, in <module>
    str(u'€')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)
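
The failure above can be reproduced with an explicit ASCII encode. This is a minimal sketch, again assuming ASCII as the default encoding. The euro sign is written as the escape u'\u20ac' so the snippet does not depend on the source file or terminal encoding.

```python
# Sketch: ASCII cannot represent the euro sign, UTF-8 can.
euro = u'\u20ac'  # EURO SIGN, written as an escape to avoid encoding issues

try:
    euro.encode('ascii')      # what Python 2's str(euro) attempts by default
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False          # ASCII only covers code points 0..127

utf8_bytes = euro.encode('utf-8')  # three bytes: 0xE2 0x82 0xAC

print(ascii_ok)         # False
print(len(utf8_bytes))  # 3
```

Naming the encoding explicitly keeps the result independent of whatever the interpreter's default happens to be.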

answered Oct 06 '22 23:10

Mark Byers
