I have many rows in a database that contain XML, and I'm trying to write a Python script to count instances of a particular node attribute.
My tree looks like:

    <foo>
        <bar>
            <type foobar="1"/>
            <type foobar="2"/>
        </bar>
    </foo>

How can I access the attributes "1" and "2" in the XML using Python?
minidom is the quickest and pretty straightforward.
XML:

    <data>
        <items>
            <item name="item1"></item>
            <item name="item2"></item>
            <item name="item3"></item>
            <item name="item4"></item>
        </items>
    </data>

Python:

    from xml.dom import minidom

    xmldoc = minidom.parse('items.xml')
    itemlist = xmldoc.getElementsByTagName('item')
    print(len(itemlist))
    print(itemlist[0].attributes['name'].value)
    for s in itemlist:
        print(s.attributes['name'].value)

Output:

    4
    item1
    item1
    item2
    item3
    item4
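Applied to the XML from the question, the same minidom approach might look like the sketch below. It assumes the XML is already available as a string (so it uses `minidom.parseString` rather than `minidom.parse`), and the helper name `foobar_values` is made up for illustration.

```python
from xml.dom import minidom

# The sample document from the question, held in memory as a string.
XML_TEXT = '<foo><bar><type foobar="1"/><type foobar="2"/></bar></foo>'


def foobar_values(xml_text):
    """Return the foobar attribute of every <type> element (hypothetical helper)."""
    doc = minidom.parseString(xml_text)
    # getElementsByTagName searches the whole document, at any depth.
    return [node.getAttribute('foobar') for node in doc.getElementsByTagName('type')]


print(foobar_values(XML_TEXT))  # ['1', '2']
```

Note that `getAttribute` returns an empty string when the attribute is missing, so elements without `foobar` would show up as `''` in the list.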
I suggest ElementTree. There are other compatible implementations of the same API, such as lxml, and cElementTree in the Python standard library itself (deprecated since Python 3.3 and removed in 3.9, because xml.etree.ElementTree now picks up the C accelerator automatically). In this context, though, what they chiefly add is even more speed; the ease of programming depends on the API, which ElementTree defines.
First build an Element instance, root, from the XML, e.g. with the XML function, or by parsing a file with something like:

    import xml.etree.ElementTree as ET

    root = ET.parse('thefile.xml').getroot()

Or use any of the many other ways shown in the ElementTree documentation. Then do something like:

    for type_tag in root.findall('bar/type'):
        value = type_tag.get('foobar')
        print(value)

Similar code patterns are usually just as simple.
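Since the question is about counting attribute values across many database rows, here is a minimal sketch of how that could go with ElementTree. The `rows` list is a hypothetical stand-in for the XML strings fetched from the database, and `count_foobar` is an illustrative name, not part of any library.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for the XML column of several database rows.
rows = [
    '<foo><bar><type foobar="1"/><type foobar="2"/></bar></foo>',
    '<foo><bar><type foobar="1"/></bar></foo>',
]


def count_foobar(xml_rows):
    """Count how often each foobar value appears across all rows."""
    counts = Counter()
    for xml_text in xml_rows:
        # fromstring returns the root element (<foo>) directly.
        root = ET.fromstring(xml_text)
        for type_tag in root.findall('bar/type'):
            value = type_tag.get('foobar')
            if value is not None:  # skip <type> elements without the attribute
                counts[value] += 1
    return counts


counts = count_foobar(rows)
print(counts['1'], counts['2'])  # 2 1
```

The path `bar/type` is relative to the root element, so it only matches `<type>` elements that are direct children of a `<bar>` directly under `<foo>`; something like `.//type` would match at any depth instead.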