
Get Xpath dynamically using ElementTree getpath()

I need to write a function that finds elements in a subtree of an Atom XML document by dynamically building the XPath to each element.

To do so, I've written something like this:

    tree = etree.parse(xmlFileUrl)
    e = etree.XPathEvaluator(tree, namespaces={'def':'http://www.w3.org/2005/Atom'})
    entries = e('//def:entry')
    for entry in entries:
        mypath = tree.getpath(entry) + "/category"
        category = e(mypath)

The code above fails to find "category" because getpath() returns an XPath without namespaces, whereas the XPathEvaluator e() requires namespaces.

Although I know I can use the path and provide a namespace in the call to XPathEvaluator, I would like to know if it's possible to make getpath() return a "fully qualified" path, using all the namespaces, as this is convenient in certain cases.

(This is a spin-off question of my earlier question: Python XpathEvaluator without namespace)

asked Oct 30 '12 09:10 by puntofisso


2 Answers

With Python's standard xml.etree library, a different traversal function is needed. To achieve this, you can build a modified version of the iter() method that also yields each element's path:

    def etree_iter_path(node, tag=None, path='.'):
        """Yield (element, path) pairs for node and its descendants, like iter()."""
        if tag == "*":
            tag = None  # "*" matches every tag
        if tag is None or node.tag == tag:
            yield node, path
        for child in node:
            # Extend the path with the child's tag ({uri}name form if namespaced).
            child_path = '%s/%s' % (path, child.tag)
            yield from etree_iter_path(child, tag, path=child_path)

Then you can use this function for the iteration of the tree from the root node:

    from xml.etree import ElementTree

    xml_path = 'feed.xml'  # placeholder: path to your XML file
    xmldoc = ElementTree.parse(xml_path)
    for elem, path in etree_iter_path(xmldoc.getroot()):
        print(elem, path)
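
As a rough illustration of what the generated paths look like on an Atom document, here is a minimal sketch. The inline sample feed and the names `SAMPLE`, `ATOM_NS`, and `category_path` are made up for the example. The function is repeated so the snippet runs on its own. It assumes the paths are fed back into ElementTree's own `find()`, which accepts the `{uri}name` form.

```python
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'

# Hypothetical sample feed, inlined so the sketch runs without a file.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <entry>
    <title>First</title>
    <category term="news"/>
  </entry>
</feed>"""


def etree_iter_path(node, tag=None, path='.'):
    """Yield (element, path) pairs for node and all of its descendants."""
    if tag == "*":
        tag = None
    if tag is None or node.tag == tag:
        yield node, path
    for child in node:
        child_path = '%s/%s' % (path, child.tag)
        yield from etree_iter_path(child, tag, path=child_path)


root = ET.fromstring(SAMPLE)
pairs = [(path, elem) for elem, path in etree_iter_path(root)]

# Namespaced tags appear as {uri}name, so the path is already fully qualified.
category_path = './{%s}entry/{%s}category' % (ATOM_NS, ATOM_NS)
found = root.find(category_path)
```

Note that these paths carry no positional index, so if the feed has several entries the same path string is produced for each of them; `find()` would then return only the first match and `findall()` all of them.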
answered Oct 05 '22 00:10 by Davide Brunato


Rather than trying to construct a full path from the root, you can evaluate an XPath expression with the entry as the base node:

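    from lxml import etree

    tree = etree.parse(xmlFileUrl)
    nsmap = {'def': 'http://www.w3.org/2005/Atom'}
    entries_expr = etree.XPath('//def:entry', namespaces=nsmap)
    # category is also in the Atom namespace, so the relative step needs the prefix
    category_expr = etree.XPath('def:category', namespaces=nsmap)
    for entry in entries_expr(tree):
        category = category_expr(entry)  # list of matching child elements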

If performance is not critical, you can simplify the code by using the .xpath() method on elements rather than pre-compiled expressions:

    tree = etree.parse(xmlFileUrl)
    nsmap = {'def': 'http://www.w3.org/2005/Atom'}
    for entry in tree.xpath('//def:entry', namespaces=nsmap):
        # the prefix is needed here too: category is in the Atom namespace
        category = entry.xpath('def:category', namespaces=nsmap)
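
For readers limited to the standard library, the same "query relative to the entry" idea can be sketched with `xml.etree.ElementTree`, whose `findall()` and `findtext()` accept a prefix-to-URI mapping. This swaps lxml's `xpath()` for ElementTree's more limited path syntax. The sample feed and the name `terms_by_title` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, inlined so the sketch runs without a file.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>First</title>
    <category term="news"/>
    <category term="python"/>
  </entry>
  <entry>
    <title>Second</title>
    <category term="xml"/>
  </entry>
</feed>"""

nsmap = {'def': 'http://www.w3.org/2005/Atom'}
root = ET.fromstring(SAMPLE)

terms_by_title = {}
# Each lookup is relative to the element it is called on, so no full path is needed.
for entry in root.findall('def:entry', nsmap):
    title = entry.findtext('def:title', namespaces=nsmap)
    categories = entry.findall('def:category', nsmap)
    terms_by_title[title] = [c.get('term') for c in categories]
```

Because every query starts from the entry itself, the position of the entry in the document never has to be encoded in a path string.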
answered Oct 05 '22 00:10 by Simon Sapin