Is there a way to ignore the XML namespace in tag names in elementtree.ElementTree?
I am trying to print all technicalContact tags:
for item in root.getiterator(tag='{http://www.example.com}technicalContact'):
    print item.tag, item.text
And I get something like:
{http://www.example.com}technicalContact [email protected]
But what I really want is:
technicalContact [email protected]
Is there a way to display only the suffix (sans xmlns), or better - iterate over the elements without explicitly stating xmlns?
ElementTree is a Python library for parsing and navigating XML documents. It represents the document as a tree of elements that is easy to work with.
etree supports parsing XML from several kinds of source: strings, files and file-like objects (lxml.etree additionally accepts HTTP/FTP URLs). The main parse functions are fromstring() and parse(), both called with the source as the first argument.
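As a minimal sketch of those two entry points, assuming the standard library's xml.etree.ElementTree under Python 3 (the sample document and the variable names are made up for illustration):

```python
import io
import xml.etree.ElementTree as ET

# Illustrative document; the namespace URI is the one used in the question.
XML_TEXT = (
    '<root xmlns="http://www.example.com">'
    '<technicalContact>Test1</technicalContact>'
    '</root>'
)

# fromstring() parses a string and returns the root Element directly.
root_from_string = ET.fromstring(XML_TEXT)

# parse() takes a filename or file-like object and returns an ElementTree;
# getroot() gives the root Element. StringIO stands in for a real file here.
tree = ET.parse(io.StringIO(XML_TEXT))
root_from_file = tree.getroot()

# Both routes store the namespace inside the tag as "{uri}localname".
print(root_from_string.tag)
print(root_from_file[0].tag)
```

Either way, the namespace ends up embedded in each element's tag, which is why the loop in the question has to spell it out.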
You can define a generator to recursively search through your element tree in order to find tags which end with the appropriate tag name. For example, something like this:
def get_element_by_tag(element, tag):
    if element.tag.endswith(tag):
        yield element
    for child in element:
        for g in get_element_by_tag(child, tag):
            yield g
This just checks for tags which end with tag, i.e. ignoring any leading namespace. You can then iterate over any tag you want as follows:
for item in get_element_by_tag(elementtree, 'technicalContact'):
    ...
This generator in action:
>>> from xml.etree import ElementTree as etree
>>> xml_str = """<root xmlns="http://www.example.com">
... <technicalContact>Test1</technicalContact>
... <technicalContact>Test2</technicalContact>
... </root>
... """
>>> xml_etree = etree.fromstring(xml_str)
>>> for item in get_element_by_tag(xml_etree, 'technicalContact'):
...     print item.tag, item.text
...
{http://www.example.com}technicalContact Test1
{http://www.example.com}technicalContact Test2
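To print only the suffix, as the question asks, the namespace can be stripped from the tag at display time. The sketch below is a Python 3 variant under a few assumptions: local_name and iter_by_local_name are hypothetical helper names, not ElementTree functions, and comparing the whole local name is a deliberate change from endswith, which would also match longer names with the same ending (searching for Contact would return technicalContact as well). The {*} wildcard at the end needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag without its leading '{namespace}' part, if any."""
    # Comments and processing instructions carry a non-string tag.
    if not isinstance(tag, str):
        return None
    return tag.rsplit('}', 1)[-1]


def iter_by_local_name(element, name):
    """Yield element and its descendants whose local tag name equals name."""
    for node in element.iter():
        if local_name(node.tag) == name:
            yield node


xml_str = """<root xmlns="http://www.example.com">
  <technicalContact>Test1</technicalContact>
  <technicalContact>Test2</technicalContact>
</root>"""
root = ET.fromstring(xml_str)

# Collect (suffix, text) pairs, then print them without the namespace.
found = [(local_name(item.tag), item.text)
         for item in iter_by_local_name(root, 'technicalContact')]
for name, text in found:
    print(name, text)

# Python 3.8+: the path syntax accepts "{*}" to mean "any namespace".
wildcard = [item.text for item in root.findall('.//{*}technicalContact')]
```

The loop prints technicalContact Test1 and technicalContact Test2. Note that findall with './/' searches descendants only, so it would not return the root element itself, whereas the generator also tests the element it is given.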