
Python: sort XML elements by tag and attributes recursively

I am new to Python and I am trying to sort XML with some rules.
My example:

<?xml version="1.0"?>
<data>
    <e2 id="3" name="name3">
        <e12 num="num12" desc="desc12"/>
        <e12 num="num12" desc="desc11"/>
        <e11 num="num1" desc="desc1"/>
    </e2>
    <e2 id="2" name="name2">
        <e11 num="num1" desc="desc1"/>
    </e2>
    <e1 id="1" name="name1">
        <e12 num="num12" desc="desc12"/>
        <e11 num="num4" desc="desc4"/>
    </e1>
</data>

My rules are:
1) Sort the attributes of each element by name.
2) Sort the elements:
* by tag name (if there are no attributes)
* if the tag names are the same, by their attributes, in order

In my case e1 must come before e2.
Since there are two e2 elements, they need to be ordered by their attributes: one has id=2 and the other has id=3, so the order is decided by the id value.
The desired output XML would look like this:

<?xml version="1.0"?>
<data>
    <e1 id="1" name="name1">
        <e11 desc="desc4" num="num4"/>
        <e12 desc="desc12" num="num12"/>
    </e1>
    <e2 id="2" name="name2">
        <e11 desc="desc1" num="num1"/>
    </e2>
    <e2 id="3" name="name3">
        <e11 desc="desc1" num="num1"/>
        <e12 desc="desc11" num="num12"/>
        <e12 desc="desc12" num="num12"/>
    </e2>
</data>

Any advice or ideas on how to do this?
Thank you.

asked Jun 07 '26 05:06 by Deniz


2 Answers

You can sort your XML with ElementTree. In my example, the first layer is sorted by tag name and then by the value of the 'name' attribute; the child elements are sorted by tag name and then by the value of the 'desc' attribute.

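import xml.etree.ElementTree as ET

# xmlstr is assumed to hold the XML document from the question as a string
tree = ET.ElementTree(ET.fromstring(xmlstr))
root = tree.getroot()

# sort the first layer by (tag, value of 'name')
root[:] = sorted(root, key=lambda child: (child.tag, child.get('name')))

# sort the second layer by (tag, value of 'desc')
for c in root:
    c[:] = sorted(c, key=lambda child: (child.tag, child.get('desc')))

# Note: Python 3.8+ writes attributes in their original order;
# earlier versions sort them by name when serializing.
xmlstr = ET.tostring(root, encoding="utf-8", method="xml")
print(xmlstr.decode("utf-8"))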

This prints:

<data>
<e1 id="1" name="name1">
    <e11 desc="desc4" num="num4" />
    <e12 desc="desc12" num="num12" />
</e1>
<e2 id="2" name="name2">
    <e11 desc="desc1" num="num1" />
</e2>
<e2 id="3" name="name3">
    <e11 desc="desc1" num="num1" />
    <e12 desc="desc11" num="num12" />
    <e12 desc="desc12" num="num12" />
</e2>
</data>
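
The two-layer approach above could be generalized to any depth with a small recursive helper. The following is only a sketch under some assumptions: the names `sort_key`, `sort_tree` and `SAMPLE` are made up for illustration, elements with the same tag are compared by their full sorted list of (name, value) attribute pairs, and all values are compared as strings. It also reorders the attributes explicitly, since Python 3.8+ no longer sorts them on output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shortened from the question's document.
SAMPLE = """<data>
<e2 id="3" name="name3">
<e12 num="num12" desc="desc12"/>
<e12 num="num12" desc="desc11"/>
<e11 num="num1" desc="desc1"/>
</e2>
<e2 id="2" name="name2"><e11 num="num1" desc="desc1"/></e2>
<e1 id="1" name="name1">
<e12 num="num12" desc="desc12"/>
<e11 num="num4" desc="desc4"/>
</e1>
</data>"""


def sort_key(el):
    # Tag name first, then the attributes as (name, value) pairs in name order.
    return (el.tag, sorted(el.attrib.items()))


def sort_tree(el):
    # Re-insert the attributes in name order so they are serialized that way.
    attrs = sorted(el.attrib.items())
    el.attrib.clear()
    el.attrib.update(attrs)
    # Sort the descendants first, then the direct children of this element.
    for child in el:
        sort_tree(child)
    el[:] = sorted(el, key=sort_key)


root = ET.fromstring(SAMPLE)
sort_tree(root)
print(ET.tostring(root, encoding="unicode"))
```

Because the key compares plain strings, an id of "10" would sort before "2"; converting numeric attributes with `int()` inside the key is one way around that if the data calls for it.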
answered Jun 10 '26 08:06 by Leonhard Triendl


A solution using the xml.etree.ElementTree module:

import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
data = tree.getroot()
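els = data.findall("*[@id]")   # all e<number> elements having `id` attribute
# sort the top-level elements by (tag, id); ids are compared as strings
new_els = sorted(els, key=lambda el: (el.tag, el.attrib['id']))
for el in new_els:
    # sort each element's children by (tag, desc)
    el[:] = sorted(el, key=lambda e: (e.tag, e.attrib['desc']))
data[:] = new_els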

tree.write('result.xml', xml_declaration=True, encoding='utf-8')

The final result.xml contents:

<?xml version='1.0' encoding='utf-8'?>
<data>
    <e1 id="1" name="name1">
        <e11 desc="desc4" num="num4" />
    <e12 desc="desc12" num="num12" />
        </e1>
<e2 id="2" name="name2">
        <e11 desc="desc1" num="num1" />
    </e2>
    <e2 id="3" name="name3">
        <e11 desc="desc1" num="num1" />
    <e12 desc="desc11" num="num12" />
        <e12 desc="desc12" num="num12" />
        </e2>
    </data>
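
The uneven indentation in result.xml comes from the whitespace stored in each element's `.tail`, which travels with the element when it is reordered. One possible way to tidy it up, assuming Python 3.9 or newer, is to call `ET.indent` after sorting. The snippet below is a minimal sketch on a made-up, shortened document (`xml_text` and `pretty` are illustrative names), not the exact file from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened input with the elements out of order.
xml_text = """<data>
    <e2 id="3"><e12 desc="b"/><e11 desc="a"/></e2>
    <e1 id="1"/>
</data>"""

root = ET.fromstring(xml_text)

# Same sorting idea as above: top level by (tag, id), children by (tag, desc).
root[:] = sorted(root, key=lambda el: (el.tag, el.get("id")))
for el in root:
    el[:] = sorted(el, key=lambda e: (e.tag, e.get("desc")))

# Rewrite the text/tail whitespace so nesting is indented consistently
# (ET.indent was added in Python 3.9).
ET.indent(root, space="    ")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

When working with a parsed file instead, the same call can be made on the tree before `tree.write(...)`.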
answered Jun 10 '26 09:06 by RomanPerekhrest


