Get Xpath dynamically using ElementTree getpath()

Tags:

I need to write a dynamic function that finds elements on a subtree of an ATOM xml by building dynamically the XPath to the element.

To do so, I've written something like this:

    tree = etree.parse(xmlFileUrl)
    e = etree.XPathEvaluator(tree, namespaces={'def':'http://www.w3.org/2005/Atom'})
    entries = e('//def:entry')
    for entry in entries:
        mypath = tree.getpath(entry) + "/category"
        category = e(mypath)

The code above fails to find "category" because getpath() returns an XPath without namespaces, whereas the XPathEvaluator e() requires namespaces.

Although I know I can use the path and provide a namespace in the call to XPathEvaluator, I would like to know if it's possible to make getpath() return a "fully qualified" path, using all the namespaces, as this is convenient in certain cases.

(This is a spin-off question of my earlier question: Python XpathEvaluator without namespace)

861

asked Oct 30 '12 09:10

puntofisso

2 Answers

Basically, using the standard Python's xml.etree library, a different visit function is needed. To achieve this scope you can build a modified version of iter method like this:

def etree_iter_path(node, tag=None, path='.'):
    if tag == "*":
        tag = None
    if tag is None or node.tag == tag:
        yield node, path
    for child in node:
        _child_path = '%s/%s' % (path, child.tag)
        for child, child_path in etree_iter_path(child, tag, path=_child_path):
            yield child, child_path

Then you can use this function for the iteration of the tree from the root node:

from xml.etree import ElementTree

xmldoc = ElementTree.parse(*path to xml file*)
for elem, path in etree_iter_path(xmldoc.getroot()):
    print(elem, path)

199

answered Oct 05 '22 00:10

Davide Brunato

Rather than trying to construct a full path from the root, you can evaluate XPath expression on with the entry as the base node:

tree = etree.parse(xmlFileUrl)
nsmap = {'def':'http://www.w3.org/2005/Atom'}
entries_expr = etree.XPath('//def:entry', namespaces=nsmap)
category_expr = etree.XPath('category')
for entry in entries_expr(tree):
    category = category_expr(entry)

If performance is not critical, you can simplify the code by using the .xpath() method on elements rather than pre-compiled expressions:

tree = etree.parse(xmlFileUrl)
nsmap = {'def':'http://www.w3.org/2005/Atom'}
for entry in tree.xpath('//def:entry', namespaces=nsmap):
    category = entry.xpath('category')

answered Oct 05 '22 00:10

Simon Sapin

Related questions
                            
                                python multiprocessing - Best way to initialize/pass database connection to be used across processes
                            
                                How to validate unicode "word characters" in Python regex?
                            
                                Is it possible to get to know which script a python process is running? [duplicate]
                            
                                Parse time string in python [duplicate]
                            
                                time.sleep hangs multithread function in python
                            
                                How can I split a text file based on comment blocks in Python?
                            
                                Import Python Module through C# .NET using IronPython
                            
                                Sympy - Rationalize all numeric values in expression? (Like Mathematica's Rationalize[])
                            
                                Is there a faster way to get input in python?
                            
                                Pip install -r requirements.txt on Git Pull?
                            
                                Best approach to pass and handle arguments to function
                            
                                PostgreSQL for Django on Elastic Beanstalk
                            
                                A Single Set of Double Quotes for Python Comments?
                            
                                Exception statement not covered by self.assertRaises in python unit test cases
                            
                                Empty model in z3
                            
                                How do I send a POST using 2-legged oauth2 in python?
                            
                                Errors in program for calculating integrals
                            
                                Weird Extra Looping
                            
                                Does django + mod_wsgi require threaded programming discipline?
                            
                                What is the interface for python iterators? [duplicate]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Get Xpath dynamically using ElementTree getpath()

Tags:

python

xpath

lxml

elementtree

puntofisso

People also ask

2 Answers

Davide Brunato

Simon Sapin

Recent Activity

Donate For Us