Python: what does "...".encode("utf8") fix?

Tags:

I wanted to url encode a python string and got exceptions with hebrew strings. I couldn't fix it and started doing some guess oriented programming. Finally, doing mystr = mystr.encode("utf8") before sending it to the url encoder saved the day.

Can somebody explain what happened? What does .encode("utf8") do? My original string was a unicode string anyways (i.e. prefixed by a u).

525

asked Jul 20 '10 14:07

flybywire

1 Answers

My original string was a unicode string anyways (i.e. prefixed by a u)

...which is the problem. It wasn't a "string", as such, but a "Unicode object". It contains a sequence of Unicode code points. These code points must, of course, have some internal representation that Python knows about, but whatever that is is abstracted away and they're shown as those \uXXXX entities when you print repr(my_u_str).

To get a sequence of bytes that another program can understand, you need to take that sequence of Unicode code points and encode it. You need to decide on the encoding, because there are plenty to choose from. UTF8 and UTF16 are common choices. ASCII could be too, if it fits. u"abc".encode('ascii') works just fine.

Do my_u_str = u"\u2119ython" and then type(my_u_str) and type(my_u_str.encode('utf8')) to see the difference in types: The first is <type 'unicode'> and the second is <type 'str'>. (Under Python 2.5 and 2.6, anyway).

Things are different in Python 3, but since I rarely use it I'd be talking out of my hat if I tried to say anything authoritative about it.

200

answered Oct 17 '22 13:10

detly

Related questions
                            
                                what is major difference between histogram,countplot and distplot in Seaborn library?
                            
                                Error in installing Matplotlib : fatal error C1083
                            
                                How to count the occurrence of values in one pandas Dataframe if the values to count are in another (in a faster way)?
                            
                                "Pythonic" way to return elements from an iterable as long as a condition based on previous element is true
                            
                                What Python way would you suggest to check whois database records?
                            
                                What are the pros and cons of the various Python implementations?
                            
                                How to implement property() with dynamic name (in python)
                            
                                Tool (or combination of tools) for reproducible environments in Python
                            
                                Find all strings in python code files
                            
                                os.system() execute command under which linux shell?
                            
                                Running Python code contained in a string
                            
                                Form validation in django
                            
                                Pythonic way of iterating over 3D array
                            
                                Python interpreter as a c++ class
                            
                                Upper limit in Python time.sleep()?
                            
                                Sorting elements in string with Python
                            
                                Python regex look-behind requires fixed-width pattern
                            
                                Python del() built-in can't be used in assignment?
                            
                                Python subprocess Help
                            
                                Modify python script to run on every file in a directory

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Python: what does "...".encode("utf8") fix?

Tags:

python

urlencode

unicode

utf-8

internationalization

flybywire

People also ask

1 Answers

detly

Recent Activity

Donate For Us