libxml2 has a number of advantages:
Downsides include:
If you are doing simple path selection, stick with ElementTree ( which is included in Python 2.5 ). If you need full spec compliance or raw speed and can cope with the distribution of native code, go with libxml2.
Sample of libxml2 XPath Use
import libxml2
doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
res = ctxt.xpathEval("//*")
if len(res) != 2:
print "xpath query: wrong node set size"
sys.exit(1)
if res[0].name != "doc" or res[1].name != "foo":
print "xpath query: wrong node set value"
sys.exit(1)
doc.freeDoc()
ctxt.xpathFreeContext()
Sample of ElementTree XPath Use
from elementtree.ElementTree import ElementTree
mydoc = ElementTree(file='tst.xml')
for e in mydoc.findall('/foo/bar'):
print e.get('title').text
The lxml package supports xpath. It seems to work pretty well, although I've had some trouble with the self:: axis. There's also Amara, but I haven't used it personally.
Sounds like an lxml advertisement in here. ;) ElementTree is included in the std library. Under 2.6 and below its xpath is pretty weak, but in 2.7+ much improved:
import xml.etree.ElementTree as ET
root = ET.parse(filename)
result = ''
for elem in root.findall('.//child/grandchild'):
# How to make decisions based on attributes even in 2.6:
if elem.attrib.get('name') == 'foo':
result = elem.text
break
Use LXML. LXML uses the full power of libxml2 and libxslt, but wraps them in more "Pythonic" bindings than the Python bindings that are native to those libraries. As such, it gets the full XPath 1.0 implementation. Native ElemenTree supports a limited subset of XPath, although it may be good enough for your needs.
Another option is py-dom-xpath, it works seamlessly with minidom and is pure Python so works on appengine.
import xpath
xpath.find('//item', doc)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With