
Python pretty XML printer with lxml

After reading from an existing file with 'ugly' XML and doing some modifications, pretty printing doesn't work. I've tried etree.write(FILE_NAME, pretty_print=True).

I have the following XML:

    <testsuites tests="14" failures="0" disabled="0" errors="0" time="0.306" name="AllTests">
        <testsuite name="AIR" tests="14" failures="0" disabled="0" errors="0" time="0.306">
    ....

And I use it like this:

    tree = etree.parse('original.xml')
    root = tree.getroot()

    ...
    # modifications
    ...

    with open(FILE_NAME, "w") as f:
        tree.write(f, pretty_print=True)
prosseek asked Feb 23 '11 04:02




2 Answers

For me, this issue was not solved until I noticed this little tidbit here:

http://lxml.de/FAQ.html#why-doesn-t-the-pretty-print-option-reformat-my-xml-output

Short version:

Read in the file with this command:

    >>> parser = etree.XMLParser(remove_blank_text=True)
    >>> tree = etree.parse(filename, parser)

That will "reset" the existing indentation by discarding the whitespace-only text between elements, allowing the output to generate its own indentation correctly. Then pretty-print as normal:

>>> tree.write(<output_file_name>, pretty_print=True) 
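Here is a minimal, self-contained sketch of the same idea, assuming lxml is installed. The helper name `pretty_xml` and the sample XML are made up for illustration; they are not from the question.

```python
from lxml import etree


def pretty_xml(xml_bytes):
    """Re-indent XML by dropping whitespace-only text nodes before serializing."""
    # remove_blank_text=True discards whitespace-only text between elements,
    # so pretty_print is free to insert its own indentation.
    parser = etree.XMLParser(remove_blank_text=True)
    root = etree.fromstring(xml_bytes, parser)
    # tostring() returns bytes by default; decode for printing.
    return etree.tostring(root, pretty_print=True).decode("utf-8")


# Hypothetical input with inconsistent existing indentation.
ugly = (
    b'<testsuites name="AllTests">     <testsuite name="AIR">\n'
    b'<testcase name="t1"/>  </testsuite>\n'
    b'</testsuites>'
)

print(pretty_xml(ugly))
```

With the default parser, the existing whitespace is kept as text nodes and the serializer leaves those elements alone, which is why the output stays "ugly".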
woodm1979 answered Sep 24 '22 01:09


Well, according to the API docs, there is no module-level function "write" in the lxml etree module (write is a method of ElementTree objects, as in tree.write(...)). You've got a couple of options for getting a pretty-printed XML string into a file. You can use the tostring function like so:

    # tostring() returns bytes by default, so open the file in binary mode
    with open('doc.xml', 'wb') as f:
        f.write(etree.tostring(root, pretty_print=True))
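As a rough sketch of both routes (assuming lxml is installed; the tiny document and the temporary path are placeholders, not from the question), you can also ask for an XML declaration and an explicit encoding, or hand a file name straight to the tree's write method:

```python
import os
import tempfile

from lxml import etree

# Hypothetical document, used only for illustration.
root = etree.fromstring(b"<a><b>text</b><c/></a>")
path = os.path.join(tempfile.mkdtemp(), "doc.xml")

# Route 1: serialize to bytes, then write them in binary mode.
data = etree.tostring(
    root, pretty_print=True, xml_declaration=True, encoding="UTF-8"
)
with open(path, "wb") as f:
    f.write(data)

# Route 2: wrap the root in an ElementTree and let it write the file itself.
etree.ElementTree(root).write(
    path, pretty_print=True, xml_declaration=True, encoding="UTF-8"
)

with open(path, "rb") as f:
    written = f.read()
print(written.decode("utf-8"))
```

Route 2 saves you from managing the file handle, which also sidesteps the text-mode versus binary-mode question.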

Or, if your input source is less than perfect and/or you want more knobs and buttons to configure your output, you could use one of the Python wrappers for the tidy lib.

http://utidylib.berlios.de/

    import tidy
    f.write(tidy.parseString(your_xml_str, **{'output_xml':1, 'indent':1, 'input_xml':1}))

http://countergram.com/open-source/pytidylib

    from tidylib import tidy_document
    document, errors = tidy_document(your_xml_str, options={'output_xml':1, 'indent':1, 'input_xml':1})
    f.write(document)
Philip Southam answered Sep 22 '22 01:09