Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Evaluate Xpath2.0 in python

I have an XPath expression as shown below.

if(replace(//p[1]/text(),'H','h') = 'hello') then //p[1]/text() else if(//p[1]/text() = 'world') then //p[2]/text() else 'notFound'

I want to display which 'if ' expression worked.

e.g //p[1]/text() if first 'if' expression worked.

'If' expression can have nested if, for loops and xpath2.0 functions.

I can't find any xpath2.0 library for python. So I tried to convert this Js library to python still I can able to split xpath2.0 expression to lexers but can't convert it fully to python.

Suggest me some Xpath2.0 library for python, if available. Also how to interpret XPath expression and display which part of the expression worked?

like image 529
Er Bharath Ram Avatar asked Oct 08 '18 17:10

Er Bharath Ram


People also ask

What is XPath in python?

XPath is a major element in the XSLT standard. XPath can be used to navigate through elements and attributes in an XML document. XPath stands for XML Path Language. XPath uses "path like" syntax to identify and navigate nodes in an XML document. XPath contains over 200 built-in functions.

What is XPath and how it is used?

The XML Path Language (XPath) is used to uniquely identify or address parts of an XML document. An XPath expression can be used to search through an XML document, and extract information from any part of the document, such as an element or attribute (referred to as a node in XML) in it.

What does XPath do in XML?

XPath uses path expressions to select nodes or node-sets in an XML document. These path expressions look very much like the expressions you see when you work with a traditional computer file system. XPath expressions can be used in JavaScript, Java, XML Schema, PHP, Python, C and C++, and lots of other languages.


2 Answers

As you already know, lxml, the cornerstone of Python's XML/XPath support, features only

XPath 1.0, XSLT 1.0 and the EXSLT extensions through libxml2 and libxslt

We still have some options.

I researched this topic recently (specifically, Python's XQuery support).
See the W3C for a reference list of XML Query Implementations.

  1. Python Modules with XPath 2+ and EXSLT Extensions (e.g. an EXSLT for regex matching)
    There are some modules on PiPy that partially offer XPath 2.0+ functionality.

  2. There are some OSS XML/NoSQL-DBMS that implement XPath/XQuery 2.0 functions, e.g.

    • Zorba, an open source portable embeddable C++ implementation of XQuery 1.0/2.0 which has Python bindings (this question has some pointers),
    • as well as Sedna and some commercial DBMS. Depending on your project, this could be a good choice.
  3. I believe Saxon/C (by Michael Kay) with Cython is the most promising road.
    It was tried before using Boost.Python and at pysaxon.
    Update: A Saxon/C extension for Python 3 has been published meanwhile.

  4. You could use a subprocess to call a CLI XML processor (as suggested here), e.g. subprocess.call(["saxon", "-o:output.xml", "-s:file.xml", "file.xslt"])

  5. Another option is using XSLT/XPath/XQuery with saxon and/or other Java XML classes in Jython.

  6. Finally, you could set-up a web service that does the hard work for you in a language like Java, .NET, etc, that comes with proper XPath 3+ support (also suggested by Kay here).

Still somewhat disappointing, especially for a big language like Python.

like image 118
wp78de Avatar answered Nov 09 '22 21:11

wp78de


As mentioned by Martin we have a Saxon product for C/C++/PHP languages, called Saxon/C which out for a few years now. We have been seeing users interested in using Saxon/C with Python. 

Update 2019 A Saxon/C extension for Python3 has now been released : SaxonC - XML document processing for C/C++, PHP and Python

One user has successfully used Boost.Python to interface with our C++ library.  Another user has done the interfacing in a different way: https://github.com/ajelenak/pysaxon

like image 3
ond1 Avatar answered Nov 09 '22 22:11

ond1