I am trying to emit an XML file with ElementTree that contains an XML declaration and namespaces. Here is my sample code:
from xml.etree import ElementTree as ET

ET.register_namespace('com', "http://www.company.com")  # some name

# build a tree structure
root = ET.Element("STUFF")
body = ET.SubElement(root, "MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)
tree.write("page.xml", xml_declaration=True, method="xml")
However, neither the <?xml declaration comes out nor any namespace/prefix information. I'm more than a little confused here.
ElementTree is a Python standard-library module for parsing and navigating XML documents. It represents the document as a tree of elements that is easy to work with.
There are two ways to parse XML with the ElementTree module: the parse() function and the fromstring() function. parse() reads an XML document from a file (a filename or a file object), whereas fromstring() parses XML that is already held in a string.
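As a minimal sketch of the difference between the two functions, the snippet below parses the same document both ways. The sample XML and the variable names are made up for illustration, and an in-memory io.StringIO stands in for a real file on disk.

```python
import io
from xml.etree import ElementTree as ET

# Illustrative sample document (not taken from a real file)
xml_text = "<STUFF><MORE_STUFF>STUFF EVERYWHERE!</MORE_STUFF></STUFF>"

# parse() accepts a filename or a file object and returns an ElementTree
tree = ET.parse(io.StringIO(xml_text))
root_from_file = tree.getroot()

# fromstring() accepts the XML text itself and returns the root Element
root_from_string = ET.fromstring(xml_text)

print(root_from_file.tag)
print(root_from_string.find("MORE_STUFF").text)
```

Note that parse() gives you an ElementTree wrapper (call getroot() to reach the root element), while fromstring() hands back the root element directly.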
Although the docs say otherwise, I was only able to get an <?xml ...?> declaration by specifying both xml_declaration and encoding.
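A quick way to check this without touching the disk is to write into an in-memory buffer and look at the first line. This is only a sketch: the buffer and the first_line variable are illustrative names, and the behaviour when encoding is omitted may vary between Python versions.

```python
import io
from xml.etree import ElementTree as ET

root = ET.Element("STUFF")
tree = ET.ElementTree(root)

# Write to an in-memory bytes buffer instead of a file on disk
buffer = io.BytesIO()
tree.write(buffer, xml_declaration=True, encoding="utf-8")

# The declaration, if emitted, is the first line of the output
first_line = buffer.getvalue().splitlines()[0]
print(first_line)
```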
You have to declare nodes in the namespace you've registered to get the namespace on the nodes in the file. Here's a fixed version of your code:
from xml.etree import ElementTree as ET

ET.register_namespace('com', "http://www.company.com")  # some name

# build a tree structure
root = ET.Element("{http://www.company.com}STUFF")
body = ET.SubElement(root, "{http://www.company.com}MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)
tree.write("page.xml", xml_declaration=True, encoding='utf-8', method="xml")
<?xml version='1.0' encoding='utf-8'?><com:STUFF xmlns:com="http://www.company.com"><com:MORE_STUFF>STUFF EVERYWHERE!</com:MORE_STUFF></com:STUFF>
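To see why the "{uri}tag" form is needed, it can help to read such output back in. The sketch below parses a copy of the element markup from above (the declaration is left out for brevity) and shows that ElementTree stores the namespace URI in the tag itself; the ns mapping is just an illustrative name for the prefix lookup passed to find().

```python
from xml.etree import ElementTree as ET

# Same element markup as the output above, without the XML declaration
xml_text = (
    '<com:STUFF xmlns:com="http://www.company.com">'
    "<com:MORE_STUFF>STUFF EVERYWHERE!</com:MORE_STUFF>"
    "</com:STUFF>"
)

root = ET.fromstring(xml_text)

# The prefix is gone; the tag carries the namespace URI in braces
print(root.tag)

# A prefix-to-URI mapping lets find() use the short prefixed form
ns = {"com": "http://www.company.com"}
body = root.find("com:MORE_STUFF", ns)
print(body.text)
```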
ElementTree doesn't pretty-print either. Here's pretty-printed output:
<?xml version='1.0' encoding='utf-8'?>
<com:STUFF xmlns:com="http://www.company.com">
    <com:MORE_STUFF>STUFF EVERYWHERE!</com:MORE_STUFF>
</com:STUFF>
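If you are on a recent Python (3.9 or later, which is an assumption about your environment), the standard library has since gained ET.indent(), which inserts the whitespace in place. A minimal sketch, with the two-space indent chosen arbitrarily:

```python
from xml.etree import ElementTree as ET

ET.register_namespace("com", "http://www.company.com")

root = ET.Element("{http://www.company.com}STUFF")
body = ET.SubElement(root, "{http://www.company.com}MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# ET.indent() modifies the tree in place (requires Python 3.9+)
ET.indent(root, space="  ")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

On older versions you would still need to add the whitespace yourself or post-process the output.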
You can also declare a default namespace, in which case you don't need to register a prefix:
from xml.etree import ElementTree as ET

# build a tree structure
root = ET.Element("{http://www.company.com}STUFF")
body = ET.SubElement(root, "{http://www.company.com}MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)
tree.write("page.xml", xml_declaration=True, encoding='utf-8', method="xml",
           default_namespace='http://www.company.com')
<?xml version='1.0' encoding='utf-8'?>
<STUFF xmlns="http://www.company.com">
    <MORE_STUFF>STUFF EVERYWHERE!</MORE_STUFF>
</STUFF>
I've never been able to get the <?xml declaration out of the ElementTree libraries programmatically, so I'd suggest you try something like this:
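from xml.etree import ElementTree as ET

root = ET.Element("STUFF")
# write the namespace declaration by hand as an xmlns attribute
root.set('xmlns:com', 'http://www.company.com')
body = ET.SubElement(root, "MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# encoding='unicode' makes tostring() return str rather than bytes on Python 3
with open('page.xml', 'w', encoding='utf-8') as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>'
            + ET.tostring(root, encoding='unicode'))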
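On newer Python versions (3.8 or later, again an assumption about your setup), ET.tostring() accepts an xml_declaration argument, so the string concatenation may not be necessary. A small sketch, with xml_bytes as an illustrative variable name:

```python
from xml.etree import ElementTree as ET

root = ET.Element("STUFF")
body = ET.SubElement(root, "MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# xml_declaration is a keyword argument of tostring() since Python 3.8
xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

The result is bytes, so open the output file in binary mode ('wb') if you write it out directly.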
ElementTree implementations outside the Python standard library may have different ways to specify namespaces, so if you decide to move to lxml, the way you declare them will be different.