
Formatting inserted elements using python xml.etree module, to include new lines

I am inserting a single element into a large xml file. I want the inserted element to be at the top (so I need to use the root.insert method, and can't just append to the file). I would also like the formatting of the element to match the rest of the file.

The original XML file has the format

<a>
    <b>
        <c/>
    </b>
    <d>
        <e/>
    </d>
    ....
</a>

I then run the following code:

import xml.etree.ElementTree as ET    

xmlfile = ET.parse('file.xml')
a = xmlfile.getroot()

f = ET.Element('f')
g = ET.SubElement(f,'g')

a.insert(1, f)

xmlfile.write('file.xml')

Which creates an output in the form:

<a>
    <b>
        <c/>
    </b>
    <f><g/></f><d>
        <e/>
    </d>
    ....
</a>

but I would like it in the form:

<a>
    <b>
        <c/>
    </b>
    <f>
        <g/>
    </f>
    <d>
        <e/>
    </d>
    ....
</a>

Using Jonathan Eunice's solution to the question 'How do I get Python's ElementTree to pretty print to an XML file?' I have added the following code to replace the xmlfile.write command:

from xml.dom import minidom
xmlstr = minidom.parseString(ET.tostring(a)).toprettyxml(indent="   ")
# Use a name other than f, which already refers to the new element
with open("New_Database.xml", "w") as out:
    out.write(xmlstr)

However, the formatting of the whole file is still not correct. The new element is formatted correctly, but the original elements are now spaced out:

<a>
<b>


    <c/>


</b>


<f>
    <g/>
</f>
<d>


    <e/>


</d>
....
</a>

I think this is because toprettyxml() adds a new line at every existing '\n' in the document, so the original elements, which already contain line breaks, end up with two extra new lines each. Fiddling with the inputs only changes whether the added element or the original elements are formatted incorrectly. I need a way to modify either the new element or the original elements before inserting, so that their formatting matches and I can then reformat the whole lot before writing it out. Is it possible to add formatting using xml.etree.ElementTree?
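
The suspected cause can be checked with a small sketch. It assumes the extra blank lines come from the whitespace-only text that the parser keeps between the original tags, which toprettyxml() then wraps in its own indentation and newlines. The helper name strip_whitespace and the inline sample document are made up for illustration:

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

SOURCE = "<a>\n    <b>\n        <c/>\n    </b>\n</a>"


def strip_whitespace(elem):
    """Drop whitespace-only text and tail on elem and all its descendants."""
    for node in elem.iter():
        if node.text is not None and not node.text.strip():
            node.text = None
        if node.tail is not None and not node.tail.strip():
            node.tail = None


root = ET.fromstring(SOURCE)
new = ET.Element("f")
ET.SubElement(new, "g")
root.insert(1, new)

# Pretty-printing as-is keeps the original whitespace nodes,
# so the original elements come out surrounded by blank lines.
raw = minidom.parseString(ET.tostring(root)).toprettyxml(indent="    ")

# Removing the whitespace-only nodes first leaves toprettyxml()
# as the only source of indentation.
strip_whitespace(root)
pretty = minidom.parseString(ET.tostring(root)).toprettyxml(indent="    ")
print(pretty)
```

Note that toprettyxml() also prepends an XML declaration, and this approach discards any whitespace-only text in the file, so it only suits documents where such whitespace carries no meaning.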

Thanks in advance.

asked Oct 19 '22 17:10 by thisiscomplex

1 Answer

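It is possible to control the whitespace through the text and tail properties of each element: text holds whatever follows the element's opening tag (up to its first child), and tail holds whatever follows its closing tag. Setting these by hand on the new element may be good enough for you. See the demo below.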
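
As a quick illustration of what these two properties hold, the following sketch parses the question's sample document from an inline string (rather than file.xml, purely to keep it self-contained) and prints the whitespace stored on each element:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<a>\n    <b>\n        <c/>\n    </b>\n    <d>\n        <e/>\n    </d>\n</a>"
)
b, d = root  # the two children of <a>

print(repr(root.text))  # whitespace between <a> and <b>
print(repr(b.text))     # whitespace between <b> and <c/>
print(repr(b[0].tail))  # whitespace between <c/> and </b>
print(repr(b.tail))     # whitespace between </b> and <d>
print(repr(d.tail))     # whitespace between </d> and </a>
```

A newly created element has text and tail set to None, which is why it is written on a single line unless they are filled in.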

Input document:

<a>
    <b>
        <c/>
    </b>
    <d>
        <e/>
    </d>
</a>

Code:

import xml.etree.ElementTree as ET    

xmlfile = ET.parse('file.xml')
a = xmlfile.getroot()

f = ET.Element('f')
g = ET.SubElement(f,'g')

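f.tail = "\n    "      # newline + indent before the next sibling <d>
f.text = "\n        "  # newline + indent before the child <g>
g.tail = "\n    "      # newline + indent before the closing </f>

a.insert(1, f)

# Python 3: print is a function; encoding="unicode" returns str, not bytes
print(ET.tostring(a, encoding="unicode"))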

Output:

<a>
    <b>
        <c />
    </b>
    <f>
        <g />
    </f>
    <d>
        <e />
    </d>
</a>
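
If hard-coding the whitespace strings is too brittle, one alternative on Python 3.9 or later is to let ET.indent() recompute the indentation of the whole tree after the insert. This is a sketch rather than part of the demo above; it assumes that re-indenting every element with four spaces is acceptable, and it again uses an inline copy of the input document:

```python
import xml.etree.ElementTree as ET

SOURCE = """<a>
    <b>
        <c/>
    </b>
    <d>
        <e/>
    </d>
</a>"""

root = ET.fromstring(SOURCE)

f = ET.Element("f")
ET.SubElement(f, "g")
root.insert(1, f)

# ET.indent (added in Python 3.9) rewrites whitespace-only text and tail
# values in place so that every nesting level is indented by `space`.
ET.indent(root, space="    ")

result = ET.tostring(root, encoding="unicode")
print(result)
```

Because ET.indent() modifies the tree in place, the same call can be made on the root of a parsed file before writing it back with write().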

answered Oct 21 '22 06:10 by mzjn