I'm using python 2.6.2's xml.etree.cElementTree to create an xml document:
import xml.etree.cElementTree as etree
elem = etree.Element('tag')
elem.text = (u"Würth Elektronik Midcom").encode('utf-8')
xml = etree.tostring(elem,encoding='UTF-8')
At the end of the day, xml looks like:
<?xml version='1.0' encoding='UTF-8'?>
<tag>Würth Elektronik Midcom</tag>
It looks like tostring ignored the encoding parameter and encoded 'ü' into some other character encoding ('ü' is a valid utf-8 encoding, I'm fairly sure).
Any advice as to what I'm doing wrong would be greatly appreciated.
etree.tostring(elem, encoding=str)
will return str
but not binary
in Python 3
You can also serialise to a Unicode string without declaration by passing the
unicode
function as encoding (orstr
in Py3), or the name 'unicode'. This changes the return value from a byte string to an unencoded unicode string.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With