
lxml: Create an XML fragment with no root element?

Tags: python, xml, lxml

Is it possible using lxml (or the builtin etree library) to create an object that represents a fragment of xml, but contains two (or more) disjoint trees (i.e. each tree has its own separate root, but they share no common ancestor)?

That is, is there anything that could represent the following without creating another element to hold both of them:

<tree id="A"><anotherelement/></tree>
<tree id="B"><yetanotherelement/></tree>

I can't see anything in the lxml documentation that would allow that, and Stack Overflow doesn't seem to have anything directly on point.

The use-case here is that I am generating xml programmatically, and the fragments will be assembled into one document for output. I'd like an object I don't need to iterate over/special case, just pass to the lxml methods as if it were a proper tree.

(I am aware that such fragments would not of themselves be a complete and correct xml document; I want to store the intermediate products before assembly into such a document).

Asked by Marcin, May 12 '12 14:05


1 Answer

Yes, the lxml.html package has this functionality: fragments_fromstring returns a list of the top-level elements found in a string, and fragment_fromstring handles the single-element case. Both use the HTML parser, but in most cases it also handles XML quite well:

from lxml import etree, html

xml = """
    <tree id="A"><anotherelement/></tree>
    <tree id="B"><yetanotherelement/></tree>
"""

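# Parse the sibling trees with the HTML parser; the result is a list of elements.
fragments = html.fragments_fromstring(xml)

# Assemble the fragments under a real root element for output.
root = etree.Element("root")
for f in fragments:
    root.append(f)

# On Python 3, tostring() returns bytes unless encoding="unicode" is given.
print(etree.tostring(root, pretty_print=True, encoding="unicode"))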

Output:

<root>
  <tree id="A">
    <anotherelement/>
  </tree>
  <tree id="B">
    <yetanotherelement/>
  </tree>
</root>

If you look at what fragments_fromstring does under the hood, it probably wouldn't be too difficult to do the same with the XML parser if the HTML parser's result doesn't suit you.
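One way to do this with the XML parser is to wrap the string in a throwaway root, parse it, and detach the children. The sketch below is an assumption about how that might look, not lxml's own implementation. The helper name xml_fragments_fromstring and the wrapper tag are hypothetical. It uses the builtin xml.etree.ElementTree so it runs without extra packages; the calls it relies on (fromstring, remove, tail, extend) are also available in lxml.etree. It assumes the input has no XML declaration, and it discards any text that sits outside the top-level elements.

```python
import xml.etree.ElementTree as ET  # lxml.etree offers the same calls used here


def xml_fragments_fromstring(text):
    """Parse a string of sibling XML trees into a list of detached elements.

    Hypothetical helper, not part of lxml or the standard library.
    """
    # Wrap the fragments in a throwaway root so the XML parser sees a
    # single well-formed document.
    wrapper = ET.fromstring("<wrapper>" + text + "</wrapper>")
    fragments = list(wrapper)
    for element in fragments:
        wrapper.remove(element)  # detach from the temporary root
        element.tail = None      # drop whitespace that sat between fragments
    return fragments


xml = """
    <tree id="A"><anotherelement/></tree>
    <tree id="B"><yetanotherelement/></tree>
"""

fragments = xml_fragments_fromstring(xml)

# Assemble the fragments under a real root, as in the answer above.
root = ET.Element("root")
root.extend(fragments)

print(ET.tostring(root, encoding="unicode"))
```

Because this goes through the XML parser, element names keep their case and the input must be well-formed XML, which may or may not be what you want compared with the more forgiving HTML parser.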

Answered by mata, Nov 13 '22 17:11