I am new to Python and I am trying to sort an XML document according to some rules.
My example:
<?xml version="1.0"?>
<data>
<e2 id="3" name="name3">
<e12 num="num12" desc="desc12"/>
<e12 num="num12" desc="desc11"/>
<e11 num="num1" desc="desc1"/>
</e2>
<e2 id="2" name="name2">
<e11 num="num1" desc="desc1"/>
</e2>
<e1 id="1" name="name1">
<e12 num="num12" desc="desc12"/>
<e11 num="num4" desc="desc4"/>
</e1>
</data>
My rules are:
1) Sort the attributes of every element by name.
2) Sort the elements:
* by tag name (if they have no attributes)
* if the tag names are the same, by their attribute values, taken in attribute order
In my case e1 has to come before e2.
Since I have two e2 elements, they have to be ordered by their attributes: one has id=2 and the other has id=3, so the order is decided by the id value.
The desired output XML would look like this:
<?xml version="1.0"?>
<data>
<e1 id="1" name="name1">
<e11 desc="desc4" num="num4"/>
<e12 desc="desc12" num="num12"/>
</e1>
<e2 id="2" name="name2">
<e11 desc="desc1" num="num1"/>
</e2>
<e2 id="3" name="name3">
<e11 desc="desc1" num="num1"/>
<e12 desc="desc11" num="num12"/>
<e12 desc="desc12" num="num12"/>
</e2>
</data>
Any advice or ideas on how to do this?
Thank you.
You can sort your XML with ElementTree. In my example the top-level elements are sorted first by tag name and then by the value of the 'name' attribute, and the child elements are sorted by tag name and then by the value of the 'desc' attribute.
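import xml.etree.ElementTree as ET

# xmlstr is assumed to hold the XML document from the question as a string
tree = ET.ElementTree(ET.fromstring(xmlstr))
root = tree.getroot()

# sort the first layer by tag name, then by the 'name' attribute
root[:] = sorted(root, key=lambda child: (child.tag, child.get('name')))

# sort the second layer by tag name, then by the 'desc' attribute
for c in root:
    c[:] = sorted(c, key=lambda child: (child.tag, child.get('desc')))

# serialize the sorted tree and print it
xmlstr = ET.tostring(root, encoding="utf-8", method="xml")
print(xmlstr.decode("utf-8"))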
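This prints the following on Python versions before 3.8, where ElementTree writes attributes sorted by name (from 3.8 on they keep the order in which they were parsed):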
<data>
<e1 id="1" name="name1">
<e11 desc="desc4" num="num4" />
<e12 desc="desc12" num="num12" />
</e1>
<e2 id="2" name="name2">
<e11 desc="desc1" num="num1" />
</e2>
<e2 id="3" name="name3">
<e11 desc="desc1" num="num1" />
<e12 desc="desc11" num="num12" />
<e12 desc="desc12" num="num12" />
</e2>
</data>
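Rule 1 from the question (attributes sorted by name) is not handled by the element sorting above, and on Python 3.8 and later ElementTree no longer sorts attributes when writing. Here is a minimal sketch of one way to reorder them explicitly. Note that sort_attributes is a hypothetical helper name, not part of ElementTree, and the one-element document is made up for illustration.

```python
import xml.etree.ElementTree as ET


def sort_attributes(element):
    """Reorder the attributes of element and all its descendants by name."""
    for el in element.iter():
        ordered = sorted(el.attrib.items())
        # attrib is a plain dict, so re-inserting the items in sorted
        # order changes the order used when the tree is serialized
        el.attrib.clear()
        el.attrib.update(ordered)


# illustrative input, not the full document from the question
root = ET.fromstring('<data><e11 num="num1" desc="desc1"/></data>')
sort_attributes(root)
print(ET.tostring(root, encoding="unicode"))
```

Calling sort_attributes(root) before ET.tostring in the code above should give the attribute order shown in the output regardless of the Python version.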
A solution using the xml.etree.ElementTree module:
import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
data = tree.getroot()

els = data.findall("*[@id]")  # all e<number> elements having `id` attribute
new_els = sorted(els, key=lambda el: (el.tag, el.attrib['id']))

# sort the children of each element by tag name, then by `desc`
for el in new_els:
    el[:] = sorted(el, key=lambda e: (e.tag, e.attrib['desc']))

data[:] = new_els
tree.write('result.xml', xml_declaration=True, encoding='utf-8')
The final result.xml contents (attributes appear sorted by name only on Python versions before 3.8):
<?xml version='1.0' encoding='utf-8'?>
<data>
<e1 id="1" name="name1">
<e11 desc="desc4" num="num4" />
<e12 desc="desc12" num="num12" />
</e1>
<e2 id="2" name="name2">
<e11 desc="desc1" num="num1" />
</e2>
<e2 id="3" name="name3">
<e11 desc="desc1" num="num1" />
<e12 desc="desc11" num="num12" />
<e12 desc="desc12" num="num12" />
</e2>
</data>
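Both approaches hard-code the attribute names and handle exactly two levels of nesting. A more general sketch could recurse over the whole tree, assuming that elements with the same tag should be compared by their (name, value) attribute pairs taken in name order. Here sort_key and sort_tree are hypothetical helper names, and the input is a shortened version of the document from the question.

```python
import xml.etree.ElementTree as ET


def sort_key(el):
    # tag name first, then the attribute (name, value) pairs in name order
    return (el.tag, sorted(el.attrib.items()))


def sort_tree(element):
    """Sort the children of element, and of every descendant, in place."""
    for child in element:
        sort_tree(child)
    element[:] = sorted(element, key=sort_key)


xml_text = (
    '<data>'
    '<e2 id="3" name="name3">'
    '<e12 num="num12" desc="desc12"/>'
    '<e12 num="num12" desc="desc11"/>'
    '<e11 num="num1" desc="desc1"/>'
    '</e2>'
    '<e2 id="2" name="name2"><e11 num="num1" desc="desc1"/></e2>'
    '<e1 id="1" name="name1"><e11 num="num4" desc="desc4"/></e1>'
    '</data>'
)
root = ET.fromstring(xml_text)
sort_tree(root)
print([(c.tag, c.get("id")) for c in root])
# [('e1', '1'), ('e2', '2'), ('e2', '3')]
```

Attribute values are compared as strings here, so id="10" would sort before id="2". If numeric order matters, the key would need to convert those values with int().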