I like Python, but I don't want to write 10 lines just to get an attribute from an element. Maybe it's just me, but minidom isn't that mini. The code I have to write in order to parse something using it looks a lot like Java code.
Is there something that is more user-friendly? Something with overloaded operators, and which maps elements to objects?
I'd like to be able to access this:
    <root>
      <node value="30">text</node>
    </root>
as something like this:
    obj = parse(xml_string)
    print(obj.node.value)
and not by using getChildren or some other methods like that.
Python can parse XML documents with two standard-library modules: xml.etree.ElementTree and xml.dom.minidom (Minimal DOM Implementation).
ElementTree is a standard-library module that lets you parse an XML document and navigate it as a tree of elements.
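To make the difference concrete, here is a minimal sketch that reads the same attribute and text with both modules. The XML string is the one from the question; the variable names are just for illustration.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

XML = '<root> <node value="30">text</node> </root>'

# ElementTree: find the child by tag, then read its attribute and text.
root = ET.fromstring(XML)
node = root.find("node")
et_value = node.get("value")
et_text = node.text

# minidom: the same lookup through the DOM API.
dom = minidom.parseString(XML)
dom_node = dom.getElementsByTagName("node")[0]
dom_value = dom_node.getAttribute("value")
dom_text = dom_node.firstChild.data

print(et_value, et_text)
print(dom_value, dom_text)
```

Both approaches return the attribute as a string, so convert it with int() if you need a number.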
BeautifulSoup also represents a parsed document as a tree, so the BeautifulSoup class can be used to parse XML files as well.
The Python standard library provides a minimal but useful set of interfaces to work with XML. The two most basic and broadly used APIs for XML data are the SAX and DOM interfaces. With the Simple API for XML (SAX), you register callbacks for events of interest and then let the parser proceed through the document.
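As a rough illustration of the callback style, the sketch below uses xml.sax from the standard library. The handler class name and the lists it fills are made up for this example; only the ContentHandler methods it overrides come from the library.

```python
import xml.sax


class NodeValueHandler(xml.sax.ContentHandler):
    """Collects the `value` attribute and the text of every <node> element."""

    def __init__(self):
        super().__init__()
        self.values = []
        self.texts = []
        self._in_node = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Called by the parser for every opening tag.
        if name == "node":
            self._in_node = True
            self._buffer = []
            self.values.append(attrs.get("value"))

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it.
        if self._in_node:
            self._buffer.append(content)

    def endElement(self, name):
        # Called by the parser for every closing tag.
        if name == "node":
            self.texts.append("".join(self._buffer))
            self._in_node = False


handler = NodeValueHandler()
xml.sax.parseString(b'<root> <node value="30">text</node> </root>', handler)
print(handler.values, handler.texts)
```

This is noticeably more code than ElementTree for the same result, which is why SAX is usually kept for documents too large to load into memory at once.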
You should take a look at ElementTree. It doesn't do exactly what you want, but it's a lot better than minidom. If I remember correctly, it has been included in the standard library since Python 2.5. For more speed use cElementTree (on Python 3.3 and later the plain xml.etree.ElementTree already uses the C accelerator when it is available). For even more speed (and more features) you can use lxml (check the objectify API for your needs/approach).
I should add that BeautifulSoup does part of what you want. There's also Amara, which takes this approach.
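If you only need the obj.node.value style of access and don't want an extra dependency, a thin wrapper over ElementTree gets close. This is only a sketch: the Node class and the parse function are hypothetical names invented here, not the API of lxml.objectify, BeautifulSoup or Amara, and it only handles the first child with a given tag.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical wrapper exposing child elements and XML attributes as Python attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            # Avoid infinite recursion if _element is not set yet.
            raise AttributeError(name)
        # Prefer a child element with this tag, then fall back to an XML attribute.
        child = self._element.find(name)
        if child is not None:
            return Node(child)
        if name in self._element.attrib:
            return self._element.attrib[name]
        raise AttributeError(name)

    @property
    def text(self):
        # Note: this shadows any child element or attribute named "text".
        return self._element.text


def parse(xml_string):
    """Hypothetical helper: parse a string and wrap the root element."""
    return Node(ET.fromstring(xml_string))


obj = parse('<root> <node value="30">text</node> </root>')
print(obj.node.value)
```

Repeated children, namespaces and tag names that aren't valid Python identifiers are not covered, which is where a real library such as lxml earns its keep.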