
Python: Ignore xmlns in elementtree.ElementTree

Is there a way to ignore the XML namespace in tag names in elementtree.ElementTree?

I try to print all technicalContact tags:

for item in root.getiterator(tag='{http://www.example.com}technicalContact'):
        print item.tag, item.text

And I get something like:

{http://www.example.com}technicalContact [email protected]

But what I really want is:

technicalContact [email protected]

Is there a way to display only the suffix (sans xmlns), or better - iterate over the elements without explicitly stating xmlns?

Adam Matan asked Jun 27 '12 12:06




1 Answer

You can define a generator to recursively search through your element tree in order to find tags which end with the appropriate tag name. For example, something like this:

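def get_element_by_tag(element, tag):
    # Yield this element if its tag ends with the requested name,
    # whether or not a '{namespace}' prefix precedes it.
    if element.tag.endswith(tag):
        yield element
    # Recurse into the children and pass their matches along.
    for child in element:
        for g in get_element_by_tag(child, tag):
            yield g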

This just checks for tags which end with tag, i.e. ignoring any leading namespace. You can then iterate over any tag you want as follows:

for item in get_element_by_tag(elementtree, 'technicalContact'):
    ...
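One caveat: endswith() also matches longer names that merely share the suffix, so 'technicalContact' would match a tag called nontechnicalContact too. A stricter variant, sketched here for Python 3, strips the '{namespace}' prefix and compares the remaining local name exactly. The helper names local_name and iter_by_local_name and the sample document are made up for illustration; they are not part of ElementTree.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag without its '{namespace}' prefix (hypothetical helper)."""
    # Comments and processing instructions carry non-string tags.
    if not isinstance(tag, str):
        return None
    # '{http://www.example.com}name' -> 'name'; a tag without a prefix is unchanged.
    return tag.rsplit('}', 1)[-1]


def iter_by_local_name(element, name):
    """Yield element and every descendant whose local name equals name."""
    for el in element.iter():
        if local_name(el.tag) == name:
            yield el


# Illustrative document; the second child only shares a suffix with the first.
root = ET.fromstring(
    '<root xmlns="http://www.example.com">'
    '<technicalContact>Test1</technicalContact>'
    '<nontechnicalContact>Other</nontechnicalContact>'
    '</root>'
)

for item in iter_by_local_name(root, 'technicalContact'):
    print(local_name(item.tag), item.text)  # prints: technicalContact Test1
```

Printing local_name(item.tag) instead of item.tag also gives the namespace-free output asked for in the question.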

This generator in action:

>>> import xml.etree.ElementTree as etree
>>> xml_str = """<root xmlns="http://www.example.com">
... <technicalContact>Test1</technicalContact>
... <technicalContact>Test2</technicalContact>
... </root>
... """
>>> xml_etree = etree.fromstring(xml_str)
>>> for item in get_element_by_tag(xml_etree, 'technicalContact'):
...     print item.tag, item.text
...
{http://www.example.com}technicalContact Test1
{http://www.example.com}technicalContact Test2
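If you are on Python 3.8 or later (an assumption, since the question predates it), the standard library's xml.etree.ElementTree also accepts a {*} wildcard in find() and findall() paths, which matches a tag name in any namespace or none. A minimal sketch using the same sample document; the variable names are only illustrative:

```python
import xml.etree.ElementTree as ET

xml_str = """<root xmlns="http://www.example.com">
<technicalContact>Test1</technicalContact>
<technicalContact>Test2</technicalContact>
</root>
"""

root = ET.fromstring(xml_str)

# './/' searches all descendants of root; '{*}' ignores the namespace.
matches = root.findall('.//{*}technicalContact')
texts = [el.text for el in matches]
print(texts)  # prints: ['Test1', 'Test2']
```

Note that './/' looks only below the element it is called on, so the root element itself is never returned, and item.tag on each match still includes the namespace.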
Chris answered Oct 02 '22 16:10