Formatting the output as XML with lxml

My program reads an input file, builds an lxml.etree from it, modifies the tree (for example, by adding a node), and then writes it back to a file. To write it back I use:

et.write(r'Documents\Write.xml', pretty_print=True)

And the output I have is:

<Variable Name="one" RefID="two"><Component Type="three"><Value>four</Value></Component></Variable>

While I'd like something like:

<Variable Name="one" RefID="two">
    <Component Type="three">
        <Value>four</Value>
    </Component>
</Variable>

Where am I going wrong? I've tried many solutions (BeautifulSoup, tidy, parser...), but none seems to work.

JAWE asked Jul 18 '13 07:07


1 Answer

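Don't use the default parser. Create a custom parser with remove_blank_text=True, so that whitespace-only text nodes are dropped at parse time; pretty_print generally won't re-indent elements that still carry their original whitespace.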

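from lxml import etree

# Drop whitespace-only text nodes while parsing so pretty_print can re-indent
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse(self.output_file, parser=parser)
# Do stuff with the tree here
tree.write(your_output_file, pretty_print=True)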
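
For anyone who wants to try this without touching files, here is a minimal in-memory sketch of the same idea. The sample document, the extra <Value>five</Value> node, and the helper name add_value_and_serialize are made up for illustration; only XMLParser(remove_blank_text=True) and pretty_print=True come from the approach described here.

```python
from lxml import etree

# Hypothetical input: XML that was already indented when it was read.
SOURCE = b"""<Variable Name="one" RefID="two">
    <Component Type="three">
        <Value>four</Value>
    </Component>
</Variable>"""


def add_value_and_serialize(xml_bytes, remove_blank_text):
    """Parse xml_bytes, append a <Value> node, and return pretty-printed text."""
    parser = etree.XMLParser(remove_blank_text=remove_blank_text)
    root = etree.fromstring(xml_bytes, parser)

    # Stand-in for "do stuff with the tree": add one node to <Component>.
    component = root.find("Component")
    value = etree.SubElement(component, "Value")
    value.text = "five"

    return etree.tostring(root, pretty_print=True, encoding="unicode")


# Same document, same modification; only the parser option differs.
default_output = add_value_and_serialize(SOURCE, remove_blank_text=False)
cleaned_output = add_value_and_serialize(SOURCE, remove_blank_text=True)

print(default_output)
print(cleaned_output)
```

Comparing the two printed results should show the difference: with the blank text removed at parse time, the serializer is free to put every element, including the newly added one, on its own indented line.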

tymm answered Sep 20 '22 00:09