
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2

Tags:

python

I am creating an XML file in Python, and one field of the XML holds the contents of a text file. I do it like this:

import xml.etree.ElementTree as ET

f = open('myText.txt', "r")
data = f.read()
f.close()

root = ET.Element("add")
doc = ET.SubElement(root, "doc")

field = ET.SubElement(doc, "field")
field.set("name", "text")
field.text = data

tree = ET.ElementTree(root)
tree.write("output.xml")

And then I get the UnicodeDecodeError. I already tried putting the special comment # -*- coding: utf-8 -*- at the top of my script, but I still got the error. I also tried forcing the encoding of my variable with data.encode('utf-8'), but I still got the error. I know this issue is very common, but none of the solutions I found in other questions worked for me.

UPDATE

Traceback: Using only the special comment on the first line of the script

Traceback (most recent call last):
  File "D:\Python\lse\createxml.py", line 151, in <module>
    tree.write("D:\\python\\lse\\xmls\\" + items[ctr][0] + ".xml")
  File "C:\Python27\lib\xml\etree\ElementTree.py", line 820, in write
    serialize(write, self._root, encoding, qnames, namespaces)
  File "C:\Python27\lib\xml\etree\ElementTree.py", line 939, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "C:\Python27\lib\xml\etree\ElementTree.py", line 939, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "C:\Python27\lib\xml\etree\ElementTree.py", line 937, in _serialize_xml
    write(_escape_cdata(text, encoding))
  File "C:\Python27\lib\xml\etree\ElementTree.py", line 1073, in _escape_cdata
    return text.encode(encoding, "xmlcharrefreplace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 243: ordinal not in range(128)

Traceback: Using .encode('utf-8')

Traceback (most recent call last):
  File "D:\Python\lse\createxml.py", line 148, in <module>
    field.text = data.encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 227: ordinal not in range(128)

I used .decode('utf-8'), and the error message no longer appeared and my XML file was created successfully. But the problem is that the XML is not viewable in my browser.

kagat-kagat asked May 12 '13 14:05



People also ask

What is Unicode decode error?

A UnicodeDecodeError is raised when decoding a byte string (str in Python 2) with a given codec fails. Each codec maps only certain byte sequences to Unicode characters, so a byte sequence that is invalid for that codec makes decode() fail. The ASCII codec, for example, accepts only bytes in the range 0-127.
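A minimal sketch of this failure. The sample bytes are hypothetical (0xc2 0xa0 is the UTF-8 encoding of a no-break space); the asker's actual file contents are not shown.

```python
# Hypothetical sample: 0xc2 0xa0 is the UTF-8 encoding of U+00A0 (no-break space).
raw = b"caf\xc2\xa0"

try:
    raw.decode("ascii")  # ASCII only covers bytes 0-127, so 0xc2 is rejected
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# The same bytes are valid UTF-8, so decoding with the right codec succeeds.
decoded = raw.decode("utf-8")
```

In Python 2, calling .encode() on a byte string first decodes it implicitly with the ASCII codec, which is why data.encode('utf-8') in the question raised a UnicodeDecodeError rather than a UnicodeEncodeError.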

How do I fix UnicodeEncodeError in Python?

A UnicodeEncodeError is raised in the opposite direction: a codec can represent only the Unicode characters it has a mapping for, so encoding a character outside that set fails. To avoid both errors, decode bytes to unicode once on input (for example with decode('utf-8')), work with unicode internally, and encode (for example with encode('utf-8')) only on output.
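A short sketch of the encoding direction, using a hypothetical unicode string. The "xmlcharrefreplace" error handler is the one that appears in the ElementTree traceback in the question.

```python
# Hypothetical unicode text containing U+00A0 (no-break space).
text = u"price:\xa0100"

try:
    text.encode("ascii")  # ASCII has no mapping for U+00A0
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

# UTF-8 can represent every Unicode character, so this always succeeds.
utf8_bytes = text.encode("utf-8")

# With this handler, unmappable characters become XML character references.
char_refs = text.encode("ascii", "xmlcharrefreplace")
```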


1 Answer

You need to decode the data from the input byte string into unicode before using it, to avoid encoding problems.

field.text = data.decode("utf8") 
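The same idea applied to the whole script might look like the sketch below. It assumes the text file is UTF-8 encoded, which the 0xc2 byte in the traceback suggests but does not prove. The function name write_text_as_xml is hypothetical, and passing an explicit encoding to tree.write is an addition not found in the answer above.

```python
import io
import xml.etree.ElementTree as ET


def write_text_as_xml(text_path, xml_path):
    """Read a text file (assumed UTF-8) and store its contents in an XML field."""
    # io.open decodes while reading, so `data` is unicode text, not raw bytes.
    with io.open(text_path, "r", encoding="utf-8") as f:
        data = f.read()

    root = ET.Element("add")
    doc = ET.SubElement(root, "doc")
    field = ET.SubElement(doc, "field")
    field.set("name", "text")
    field.text = data

    # An explicit encoding plus an XML declaration tells readers how to decode the file.
    tree = ET.ElementTree(root)
    tree.write(xml_path, encoding="utf-8", xml_declaration=True)


# Usage, with the file names from the question:
# write_text_as_xml("myText.txt", "output.xml")
```

Decoding at the point of reading keeps byte strings out of the element tree entirely. Whether this also makes the file display in a browser is not something the question gives enough detail to confirm.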
uhbif19 answered Sep 20 '22 08:09
