ElementTree and unicode

I have this character (è) in an XML file:

<data>
  <products>
      <color>fumè</color>
  </products>
</data>

I try to generate an instance of ElementTree with the following code:

string_data = open('file.xml')
x = ElementTree.fromstring(unicode(string_data.encode('utf-8')))

and I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in position 185: ordinal not in range(128)

(NOTE: The position is not exact, I sampled the xml from a larger one).

How can I solve this? Thanks.

asked Sep 10 '12 10:09

pistacchio

2 Answers

If you ran into this problem while using Requests (HTTP for Humans): response.text decodes the response by default. Use response.content instead to get the undecoded bytes, so that ElementTree can decode them itself. Just make sure the data really is in the encoding that ElementTree expects.

More info: http://docs.python-requests.org/en/latest/user/quickstart/#response-content
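
A minimal sketch of the idea, assuming the payload is UTF-8 and carries no XML declaration (ElementTree then treats it as UTF-8). The xml_bytes variable is a hypothetical stand-in for what response.content would return; no HTTP request is made here.

```python
import xml.etree.ElementTree as ElementTree

# Hypothetical stand-in for response.content: raw, undecoded UTF-8 bytes.
xml_bytes = u'<data><products><color>fum\xe8</color></products></data>'.encode('utf-8')

# Hand the bytes to ElementTree and let the parser do the decoding.
root = ElementTree.fromstring(xml_bytes)

# The element text comes back as decoded (unicode) text.
color = root.find('products/color').text
```

Passing the raw bytes keeps a single decoding step, inside the parser, instead of decoding once by hand and again implicitly.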

answered Sep 22 '22 07:09

gitaarik

You need to decode utf-8 strings into a unicode object. So

string_data.encode('utf-8')

should be

string_data.decode('utf-8')

assuming string_data is actually a UTF-8 encoded byte string.

So to summarize: to get a UTF-8 byte string from a unicode object, you encode the unicode object (using the UTF-8 encoding); to turn a byte string into a unicode object, you decode the string using the encoding it was written in.
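
The round trip can be sketched as follows. This is only an illustration, written to run on both Python 2 and Python 3 (where the unicode type is called str and byte strings are bytes); the sample value is the one from the question, and the variable names are made up for the example.

```python
# Unicode text: a sequence of characters, independent of any encoding.
text = u'fum\xe8'

# encode: unicode -> bytes (here UTF-8, where e-grave takes two bytes).
raw = text.encode('utf-8')

# decode: bytes -> unicode, using the same encoding the bytes were written in.
back = raw.decode('utf-8')

# Decoding with the wrong codec fails, much like the error in the question.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```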

For more details on the concepts I suggest reading The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (not Python specific).

answered Sep 24 '22 07:09

Lukas Graf
