What is the difference between an ElementTree and an Element? (Python XML)

from xml.etree.ElementTree import ElementTree, Element, SubElement, dump

elem = Element('1')
sub = SubElement(elem, '2')
tree = ElementTree(elem)

dump(tree)
dump(elem)

In the code above, dumping tree (which is an ElementTree) and dumping elem (which is an Element) results in the same thing. Therefore I am having trouble determining what the difference is between the two.

Michael asked Jun 12 '15 22:06


2 Answers

dumping tree (which is an ElementTree) and dumping elem (which is an Element) results in the same thing.

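The dump() function produces the same output for an ElementTree and an Element because it was intentionally written to accept either one. If it is handed a bare Element, it wraps it in an ElementTree before writing: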

def dump(elem):
    # debugging
    if not isinstance(elem, ElementTree):
        elem = ElementTree(elem)
    elem.write(sys.stdout)
    ...
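As a quick check of the behaviour described above, the following sketch captures what dump() prints for an Element and for an ElementTree wrapping it, then compares the two. The tag names (root, child) and the helper capture_dump() are made up for illustration and are not part of the library.

```python
import contextlib
import io
import xml.etree.ElementTree as ET


def capture_dump(obj):
    """Return the text that ET.dump() prints for obj (hypothetical helper)."""
    buffer = io.StringIO()
    # dump() writes to sys.stdout, so redirect it temporarily.
    with contextlib.redirect_stdout(buffer):
        ET.dump(obj)
    return buffer.getvalue()


root = ET.Element("root")
ET.SubElement(root, "child")
tree = ET.ElementTree(root)

out_elem = capture_dump(root)   # bare Element
out_tree = capture_dump(tree)   # ElementTree wrapping the same root

print(out_elem == out_tree)
```

Since dump() is documented as a debugging aid, identical output here says nothing about the two classes being interchangeable elsewhere.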

I am having trouble determining what the difference is between the two.

ElementTree is a wrapper class that represents the "entire element hierarchy" and provides serialization functionality: writing the tree out and loading it back in. Element, on the other hand, is a much "bigger" class that defines the Element interface: the tag, attributes, text and child elements of a single node.
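The split can be illustrated with a minimal sketch: the Element holds the data, while the ElementTree holds a reference to the root and offers document-level operations such as write(). The tag names and variable names below are assumptions chosen for the example.

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("root")
ET.SubElement(root, "child").text = "hello"
tree = ET.ElementTree(root)

# The tree holds a reference to the very same root object.
assert tree.getroot() is root

# Element carries the node data: tag, attributes, text, children.
child_tags = [child.tag for child in root]

# ElementTree carries document-level operations such as write().
buffer = io.StringIO()
tree.write(buffer, encoding="unicode")
serialized = buffer.getvalue()

# Element itself has no write() method.
element_has_write = hasattr(root, "write")

print(child_tags, serialized, element_has_write)
```

In practice this means code that inspects or modifies nodes works with Element objects, and an ElementTree is only needed when reading or writing a whole document.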

alecxe answered Nov 02 '22 23:11



The ElementTree wrapper class is used to read and write XML files [ref]. Most ElementTree APIs are simple wrappers around the root Element [ref]. Simply put, ElementTree wraps the root Element for convenience and provides methods to serialize and deserialize the entire tree. Hence parse() and write() belong to ElementTree, whereas iter() simply delegates to the root Element.
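The delegation mentioned above can be observed directly. This is a small sketch with made-up XML content; it compares the results of iter() and find() when called on the tree and on its root.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><item>1</item><item>2</item></root>")
tree = ET.ElementTree(root)

# iter() on the tree walks the same elements as iter() on the root.
tree_tags = [elem.tag for elem in tree.iter()]
root_tags = [elem.tag for elem in root.iter()]

# find() on the tree returns the same Element object as find() on the root.
same_object = tree.find("item") is root.find("item")

print(tree_tags == root_tags, same_object)
```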

Then there are helper functions such as iterparse() and dump() in the xml.etree.ElementTree namespace. dump() writes a full XML document to stdout [ref], whereas iterparse() yields (event, element) pairs incrementally as the input is parsed. Contrast parse(), which returns an xml.etree.ElementTree.ElementTree object and hence a complete hierarchy, with iterparse(), which returns an iterator[1].

[1] There might be some confusion between the xml.etree.ElementTree module namespace and the xml.etree.ElementTree.ElementTree class name.
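To make the contrast between parse() and iterparse() concrete, here is a minimal sketch that parses the same in-memory document both ways. The XML payload and variable names are assumptions for illustration; a real program would usually pass a filename instead of a BytesIO object.

```python
import io
import xml.etree.ElementTree as ET  # the module, not the class

XML = b"<root><a/><b/></root>"

# parse() reads everything and returns an ElementTree instance.
tree = ET.parse(io.BytesIO(XML))
parsed_type = type(tree).__name__
root_tag = tree.getroot().tag

# iterparse() returns an iterator of (event, element) pairs.
# With the default events, each element is yielded when its end tag is seen.
seen = [elem.tag for _event, elem in ET.iterparse(io.BytesIO(XML))]

print(parsed_type, root_tag, seen)
```

The order in seen reflects when each element is closed, which is why the root comes last.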

akhan answered Nov 02 '22 23:11