I am trying to get a compact representation of namespaces in ElementTree or lxml when the subelements are in a different namespace than the parent. Here is the basic example:
from lxml import etree
country = etree.Element("country")
name = etree.SubElement(country, "{urn:test}name")
name.text = "Canada"
population = etree.SubElement(country, "{urn:test}population")
population.text = "34M"
etree.register_namespace('tst', 'urn:test')
print( etree.tostring(country, pretty_print=True) )
I also tried this approach:
ns = {"test" : "urn:test"}
country = etree.Element("country", nsmap=ns)
name = etree.SubElement(country, "{test}name")
name.text = "Canada"
population = etree.SubElement(country, "{test}population")
population.text = "34M"
print( etree.tostring(country, pretty_print=True) )
In both cases, I get something like this out:
<country>
<ns0:name xmlns:ns0="urn:test">Canada</ns0:name>
<ns1:population xmlns:ns1="urn:test">34M</ns1:population>
</country>
While that is correct, I would like it to be less verbose - this can become a real issue with large data sets (and especially because I am using a much larger NS than 'urn:test').
If I am OK with 'country' being inside the "urn:test" namespace and declare it like so (in the first example above):
country = etree.Element("{test}country")
then I get the following output:
<ns0:country xmlns:ns0="urn:test">
<ns0:name>Canada</ns0:name>
<ns0:population>34M</ns0:population>
</ns0:country>
But what I really want is this:
<country xmlns:ns0="urn:test">
<ns0:name>Canada</ns0:name>
<ns0:population>34M</ns0:population>
</country>
Any ideas?
The full name of an element consists of {namespace-url}elementName, not {prefix}elementName. The prefix-to-URL mapping is declared separately, through nsmap:
>>> from lxml import etree as ET
>>> r = ET.Element('root', nsmap={'tst': 'urn:test'})
>>> ET.SubElement(r, "{urn:test}child")
<Element {urn:test}child at 0x2592a80>
>>> ET.tostring(r)
'<root xmlns:tst="urn:test"><tst:child/></root>'
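Applied to the country example from the question, a minimal sketch might look like the following. It assumes lxml is installed and reuses the 'tst' prefix and the 'urn:test' URI from the question; the NS and xml_text variable names are only illustrative.

```python
from lxml import etree

NS = "urn:test"  # namespace URI taken from the question

# Declare the prefix once on the root; the root itself stays outside the namespace.
country = etree.Element("country", nsmap={"tst": NS})

# Children are named with the full namespace URI in braces, not the prefix.
name = etree.SubElement(country, "{%s}name" % NS)
name.text = "Canada"
population = etree.SubElement(country, "{%s}population" % NS)
population.text = "34M"

# tostring() returns bytes, so decode before printing.
xml_text = etree.tostring(country, pretty_print=True).decode("utf-8")
print(xml_text)
```

Because the children inherit the declaration from their parent, the xmlns:tst attribute should appear only once, on the country element, and the children should be written with the tst: prefix.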
In your case, an even more compact representation is possible if you set the default namespace. Unfortunately, lxml does not seem to allow an empty XML namespace, but since you say you can put the parent tag into the same namespace as the child elements, you can set the default namespace to that of the child elements:
>>> r = ET.Element('{urn:test}root', nsmap={None: 'urn:test'})
>>> ET.SubElement(r, "{urn:test}child")
<Element {urn:test}child at 0x2592b20>
>>> ET.SubElement(r, "{urn:test}child")
<Element {urn:test}child at 0x25928f0>
>>> ET.tostring(r)
'<root xmlns="urn:test"><child/><child/></root>'
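Since the question also mentions plain ElementTree, here is a hedged sketch of the same idea with the standard library, which has no nsmap argument. It relies on xml.etree.ElementTree.register_namespace, which sets a global prefix mapping; the 'tst' prefix and the xml_text name are again just illustrative choices.

```python
import xml.etree.ElementTree as ET

# Global registration: every later serialization of 'urn:test' uses the 'tst' prefix.
ET.register_namespace("tst", "urn:test")

country = ET.Element("country")
name = ET.SubElement(country, "{urn:test}name")
name.text = "Canada"
population = ET.SubElement(country, "{urn:test}population")
population.text = "34M"

# encoding="unicode" makes tostring() return str instead of bytes.
xml_text = ET.tostring(country, encoding="unicode")
print(xml_text)
```

The standard-library serializer collects the namespaces used in the tree and declares them on the element being serialized, so the declaration is not repeated on each child.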