I have an XML document to parse that is proving really tricky for me.
<bundles>
  <bundle>
    <bitstreams>
      <bitstream>
        <id>1234</id>
      </bitstream>
    </bitstreams>
    <name>FOO</name>
  </bundle>
  <bundle> ... </bundle>
</bundles>
I would like to iterate through this XML and locate all the id values inside of the bitstreams for a bundle where the name element's value is 'FOO'. I'm not interested in any bundles not named 'FOO', and there may be any number of bundles and any number of bitstreams in the bundles.
I have been using tree.findall('./bundle/name') to find the FOO bundle but this just returns a list that I can't step through for the id values:
for node in tree.findall('./bundle/name'):
    if node.text == 'FOO':
        id_values = tree.findall('./bundle/bitstreams/bitstream/id')
        for value in id_values:
            print(value.text)
This prints out all the id values, not those of the bundle 'FOO'.
How can I iterate through this tree, locate the bundle with the name FOO, take this bundle node and collect the id values nested in it? Is the XPath argument incorrect here?
I'm working in Python, with lxml bindings - but any XML parser I believe would be alright; these aren't large XML trees.
You can use XPath for this. The following Python code works:
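import libxml2

data = """
<bundles>
  <bundle>
    <bitstreams>
      <bitstream>
        <id>1234</id>
      </bitstream>
    </bitstreams>
    <name>FOO</name>
  </bundle>
</bundles>
"""

# Parse the string into a libxml2 document.
doc = libxml2.parseDoc(data)
# Select the FOO <name>, step up to its <bundle>, then down to the ids.
for node in doc.xpathEval('/bundles/bundle/name[.="FOO"]/../bitstreams/bitstream/id'):
    print(node.content)  # text content of the <id> element
doc.freeDoc()  # libxml2 documents must be freed explicitly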
or using lxml (data is the same as in the example above):
from lxml import etree

bundles = etree.fromstring(data)
for node in bundles.xpath('bundle/name[.="FOO"]/../bitstreams/bitstream/id'):
    print(node.text)
outputs:
1234
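The loop from the question can also be repaired without an XPath predicate: the problem there is that the inner findall starts again from the whole tree, so it matches the ids of every bundle. Searching relative to the matching bundle node avoids that. Below is a minimal sketch using the standard library's xml.etree.ElementTree (lxml's etree offers the same findall/findtext calls); the second bundle named BAR and the extra id values are made-up sample data, not from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two bundles, only the first is named FOO.
data = """
<bundles>
  <bundle>
    <bitstreams>
      <bitstream><id>1234</id></bitstream>
      <bitstream><id>5678</id></bitstream>
    </bitstreams>
    <name>FOO</name>
  </bundle>
  <bundle>
    <bitstreams>
      <bitstream><id>9999</id></bitstream>
    </bitstreams>
    <name>BAR</name>
  </bundle>
</bundles>
"""

root = ET.fromstring(data)  # root is the <bundles> element

foo_ids = []
for bundle in root.findall('bundle'):
    # findtext returns the text of the first matching child, or None.
    if bundle.findtext('name') == 'FOO':
        # Search relative to this bundle, not the whole tree.
        for id_node in bundle.findall('bitstreams/bitstream/id'):
            foo_ids.append(id_node.text)

print(foo_ids)
```

Only the ids under the FOO bundle end up in foo_ids; the BAR bundle is skipped entirely.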
If the <bitstreams> element always precedes the <name> element, you can also use the more efficient XPath expression:
'bundle/name[.="FOO"]/preceding-sibling::bitstreams/bitstream/id'
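If the element order cannot be guaranteed, one order-independent variant is to put the test on the bundle itself rather than navigating from the name element. A minimal sketch, assuming the standard library's xml.etree.ElementTree and its [tag='text'] child predicate; the sample document (including the bundle named BAR and the second FOO bundle with <name> first) is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <name> appears after <bitstreams> in the first
# bundle and before it in the last one.
data = """
<bundles>
  <bundle>
    <bitstreams>
      <bitstream><id>1234</id></bitstream>
    </bitstreams>
    <name>FOO</name>
  </bundle>
  <bundle>
    <name>BAR</name>
    <bitstreams>
      <bitstream><id>9999</id></bitstream>
    </bitstreams>
  </bundle>
  <bundle>
    <name>FOO</name>
    <bitstreams>
      <bitstream><id>4321</id></bitstream>
    </bitstreams>
  </bundle>
</bundles>
"""

root = ET.fromstring(data)

# [name='FOO'] keeps only bundles that have a <name> child whose text is FOO,
# regardless of where that child sits among its siblings.
path = "./bundle[name='FOO']/bitstreams/bitstream/id"
ids = [node.text for node in root.findall(path)]

print(ids)
```

Both FOO bundles contribute their ids here, whichever side of <bitstreams> the <name> element is on.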