Pretty print in lxml is failing when I add tags to a parsed tree

Tags: python, pretty-print, parsing, xml, lxml

I have an XML file that I'm working with using etree from lxml, but when I add tags to it, pretty printing doesn't seem to work.

    >>> from lxml import etree
    >>> root = etree.parse('file.xml').getroot()
    >>> print etree.tostring(root, pretty_print = True)
    <root>
      <x>
        <y>test1</y>
      </x>
    </root>

So far so good. But now

    >>> x = root.find('x')
    >>> z = etree.SubElement(x, 'z')
    >>> etree.SubElement(z, 'z1').attrib['value'] = 'val1'
    >>> print etree.tostring(root, pretty_print = True)
    <root>
      <x>
        <y>test1</y>
      <z><z1 value="val1"/></z></x>
    </root>

it's no longer pretty. I've also tried doing it "backwards": creating the z1 tag, then creating the z tag and appending z1 to it, then appending the z tag to the x tag. I get the same result.

If I don't parse the file and just create all the tags in one go, it'll print correctly. So I think it has something to do with parsing the file.

How can I get pretty printing to work?

278

asked Oct 26 '11 14:10

Kris Harper

1 Answer

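It has to do with how lxml treats whitespace: the parser keeps the whitespace-only text nodes from your file, and the serializer will not add indentation inside an element that already contains such text, since that whitespace might be significant. See the lxml FAQ for details.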
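To see where that whitespace lives, here is a small sketch. It assumes Python 3 and uses an inline byte string as a stand-in for the contents of file.xml, so the exact indentation is an assumption based on the output shown in the question.

```python
from lxml import etree

# Stand-in for file.xml (assumed contents, matching the question's output).
xml = b"<root>\n  <x>\n    <y>test1</y>\n  </x>\n</root>"

# Default parser: whitespace-only text between tags is kept in the tree.
root = etree.fromstring(xml)
x = root.find("x")
y = x.find("y")

# The file's indentation survives as .text and .tail on the elements.
print(repr(x.text))  # whitespace between <x> and <y>
print(repr(y.tail))  # whitespace between </y> and </x>

# A freshly created element has no tail, so nothing separates it
# from whatever follows it when the tree is serialized.
z = etree.SubElement(x, "z")
print(repr(z.tail))
```

Because x already holds whitespace text, the serializer leaves its content alone, which would explain why the new z element ends up glued to the closing tag.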

To fix this, change the loading part of the file to the following:

    parser = etree.XMLParser(remove_blank_text=True)
    root = etree.parse('file.xml', parser).getroot()

I didn't test it, but it should indent your file just fine with this change.
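For anyone who wants to try this without a file on disk, here is a self-contained sketch of the same fix. It assumes Python 3, and the inline XML wrapped in BytesIO is only a stand-in for file.xml; the rest follows the question's code.

```python
from io import BytesIO

from lxml import etree

# Stand-in for file.xml (assumed contents, matching the question's output).
xml = b"<root>\n  <x>\n    <y>test1</y>\n  </x>\n</root>"

# Drop whitespace-only text nodes while parsing, so the serializer
# is free to indent everything itself later.
parser = etree.XMLParser(remove_blank_text=True)
root = etree.parse(BytesIO(xml), parser).getroot()

# Add the new elements exactly as in the question.
x = root.find("x")
z = etree.SubElement(x, "z")
etree.SubElement(z, "z1").attrib["value"] = "val1"

# tostring() returns bytes on Python 3, so decode before printing.
pretty = etree.tostring(root, pretty_print=True).decode()
print(pretty)
```

With the blank text removed at parse time, z and z1 should each come out on their own indented line, next to y.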

170

answered Sep 29 '22 18:09

jro
