
Python: Joining and writing (XML.etrees) trees stored in a list

I'm looping over some XML files and producing trees that I would like to store in a defaultdict(list). On each iteration, the next child found is stored under its own key in the dictionary.

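import xml.etree.ElementTree as ET
from collections import defaultdict

d = defaultdict(list)
# Repeat this loop for each parsed file; `something` is the search path.
for counter, child in enumerate(root.findall(something)):
    tree = ET.ElementTree(child)  # wrap the matched child in its own tree
    d[counter].append(tree)       # key by position within the file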

Repeating this for several files gives nicely indexed results: the set of trees that were in position 1 across the different parsed files, and so on. The question is, how do I then join everything in d and write the trees (as one cumulative tree) to a file?

I can loop through the dict to get each tree:

for x in d:
    for y in d[x]:
        print(y)

This gives a complete list of trees that were in my dict. Now, how do I produce one massive tree from this?

Sample input file 1

Sample input file 2

Required results from 1&2

Given the apparent difficulty in doing this, I'm happy to accept more general answers that show how I can otherwise get the result I am looking for from two or more files.

BrownE asked Oct 03 '22 15:10


1 Answer

Use lxml.objectify:

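from lxml import etree, objectify

# Read as bytes so files with an XML encoding declaration parse under Python 3.
with open('file1', 'rb') as f1, open('file2', 'rb') as f2:
    obj_1 = objectify.fromstring(f1.read())
    obj_2 = objectify.fromstring(f2.read())

obj_1.Trail.CTrailData.extend(obj_2.Trail.CTrailData)
# .sort() won't work as objectify's lists are not regular python lists.
# Item access avoids any clash with the inherited Element.index() method.
obj_1.Trail.CTrailData = sorted(obj_1.Trail.CTrailData, key=lambda x: x['index'])

# tostring() returns bytes, so decode before printing.
print(etree.tostring(obj_1, pretty_print=True).decode())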

It doesn't do the additional conversion work that the Spyne variant does, but for your use case, that might be enough.
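If you would rather stay with the standard library and the defaultdict(list) layout from the question, something along these lines may work. This is only a sketch under assumptions: the sample XML strings, the `item` tag, the `merged` root name, and the helper names `collect_by_position` and `merge` are all made up for illustration, since the real input files aren't shown. It stores the matched elements themselves rather than ElementTree wrappers; if you already have ElementTree objects in your dict, calling `.getroot()` on each should give you the element to attach.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stand-ins for the parsed input files; the <item> tag is an assumption.
SAMPLE_FILES = [
    "<root><item>a1</item><item>a2</item></root>",
    "<root><item>b1</item><item>b2</item></root>",
]


def collect_by_position(xml_sources, tag):
    """Group matching child elements by their position within each document."""
    grouped = defaultdict(list)
    for source in xml_sources:
        root = ET.fromstring(source)
        for position, child in enumerate(root.findall(tag)):
            grouped[position].append(child)
    return grouped


def merge(grouped, root_tag="merged"):
    """Attach every collected element to one new root, ordered by position."""
    merged_root = ET.Element(root_tag)
    for position in sorted(grouped):
        merged_root.extend(grouped[position])
    return ET.ElementTree(merged_root)


grouped = collect_by_position(SAMPLE_FILES, "item")
merged_tree = merge(grouped)

# Written to an in-memory buffer here; pass a filename to write() to produce a file.
buffer = io.BytesIO()
merged_tree.write(buffer, encoding="utf-8", xml_declaration=True)
merged_xml = buffer.getvalue().decode("utf-8")
print(merged_xml)
```

The idea is that an XML document needs exactly one root, so the "cumulative tree" is just a fresh root element with all the collected elements attached to it in key order. Whether that flat layout matches the required result depends on the real files.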

Burak Arslan answered Oct 11 '22 14:10