I've called elems = xmldoc.getElementsByTagName('myTagName')
on an XML object that I parsed as minidom.parse(xmlObj)
. Now I'm trying to get the text content of this element, and although I spent a while looking through the dir() and trying things out, I haven't found the call yet. As an example of what I want to accomplish, in:
<myTagName> Hello there </myTagName>
I would like the extract just "Hello there". (obviously I could parse this myself but I expect there is some built-in functionality)
Thanks
To read an XML file using ElementTree, firstly, we import the ElementTree class found inside xml library, under the name ET (common convension). Then passed the filename of the xml file to the ElementTree. parse() method, to enable parsing of our xml file. Then got the root (parent tag) of our xml file using getroot().
xml. dom. minidom is a minimal implementation of the Document Object Model interface, with an API similar to that in other languages. It is intended to be simpler than the full DOM and also significantly smaller.
There are two ways to parse the file using 'ElementTree' module. The first is by using the parse() function and the second is fromstring() function. The parse () function parses XML document which is supplied as a file whereas, fromstring parses XML when supplied as a string i.e within triple quotes.
wait a mo... do you want ALL the text under a given node? It has then to involve a subtree traversal function of some kind. Doesn't have to be recursive but this works fine:
def get_all_text( node ):
if node.nodeType == node.TEXT_NODE:
return node.data
else:
text_string = ""
for child_node in node.childNodes:
text_string += get_all_text( child_node )
return text_string
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With