Is it possible, using lxml (or the builtin etree library), to create an object that represents a fragment of XML but contains two (or more) disjoint trees (i.e. each tree has its own separate root, and they share no common ancestor)?
That is, is there anything that could represent the following without creating another element to hold both of them:
<tree id="A"><anotherelement/></tree>
<tree id="B"><yetanotherelement/></tree>
I can't see anything in the lxml documentation that would allow that, and Stack Overflow doesn't seem to have anything directly on point.
The use-case here is that I am generating xml programmatically, and the fragments will be assembled into one document for output. I'd like an object I don't need to iterate over/special case, just pass to the lxml methods as if it were a proper tree.
(I am aware that such fragments would not of themselves be a complete and correct xml document; I want to store the intermediate products before assembly into such a document).
Yes, the lxml.html package has this functionality: fragment_fromstring parses a single fragment, and fragments_fromstring returns a list of them. Both use the HTML parser, but in most cases the HTML parser also handles XML quite well:
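from lxml import etree, html

xml = """
<tree id="A"><anotherelement/></tree>
<tree id="B"><yetanotherelement/></tree>
"""

# Parse the string into a list of separate top-level elements
fragments = html.fragments_fromstring(xml)

# Assemble the fragments under a single root for output
root = etree.Element("root")
for f in fragments:
    root.append(f)

# tostring() returns bytes on Python 3, so decode before printing
print(etree.tostring(root, pretty_print=True).decode())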
output:
<root>
<tree id="A">
<anotherelement/>
</tree>
<tree id="B">
<yetanotherelement/>
</tree>
</root>
If you're not happy with that result, look at what fragments_fromstring does under the hood; it probably wouldn't be too difficult to do the same using the XML parser.
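One way to get the same effect from the XML parser is to wrap the text in a throwaway root element, parse it, and then detach the children. The sketch below is only an illustration of that idea, not how lxml implements its HTML helpers: the function name xml_fragments_fromstring and the wrapper tag are made up here, and it assumes the input has no XML declaration or doctype. It uses the builtin xml.etree.ElementTree so it runs without extra packages; the calls used (fromstring, remove, Element, extend, tostring) also exist in lxml.etree. A possible reason to prefer this route is that the HTML parser lowercases tag names, which may matter for a case-sensitive XML vocabulary.

```python
import xml.etree.ElementTree as ET


def xml_fragments_fromstring(text):
    """Parse a string of sibling XML trees into a list of detached elements.

    Hypothetical helper: wraps the input in a temporary root so the XML
    parser sees a single well-formed document.
    """
    wrapper = ET.fromstring("<wrapper>%s</wrapper>" % text)
    fragments = list(wrapper)
    for element in fragments:
        wrapper.remove(element)  # detach from the temporary root
        element.tail = None      # drop whitespace that sat between the trees
    return fragments


xml = """
<tree id="A"><anotherelement/></tree>
<tree id="B"><yetanotherelement/></tree>
"""

fragments = xml_fragments_fromstring(xml)

# Assemble the stored fragments into one document when it is time to output
root = ET.Element("root")
root.extend(fragments)
print(ET.tostring(root, encoding="unicode"))
```

The result is still a plain Python list, so it has to be appended or extended into a parent rather than passed around as if it were a single tree.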