Why does the xml package modify my XML file in Python 3?

Tags: python-3.x, xml, elementtree

I use the xml library in Python 3.5 to read and write an XML file. I don't modify the file; I just open it and write it back out. But the library modifies the file.

Why is it modified?
How can I prevent this? E.g. I just want to replace a specific tag or its value in a quite complex XML file without losing any other information.

This is the example file

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
    <title>Der Eisbär</title>
    <ids>
        <entry>
            <key>tmdb</key>
            <value xsi:type="xs:int" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>
        </entry>
        <entry>
            <key>imdb</key>
            <value xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">tt0167132</value>
        </entry>
    </ids>
</movie>

This is the code

import xml.etree.ElementTree as ET
tree = ET.parse('x.nfo')
tree.write('y.nfo', encoding='utf-8')

And the xml-file becomes this

<movie xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <title>Der Eisbär</title>
    <ids>
        <entry>
            <key>tmdb</key>
            <value xsi:type="xs:int">9321</value>
        </entry>
        <entry>
            <key>imdb</key>
            <value xsi:type="xs:string">tt0167132</value>
        </entry>
    </ids>
</movie>

Line 1 is gone.
The <movie> tag in line 2 has attributes now.
The <value> tags in lines 7 and 11 now have fewer attributes.

asked Aug 31 '17 22:08

buhtz

1 Answer

Note that "xml package" and "the xml library" are ambiguous. There are several XML-related modules in the standard library: https://docs.python.org/3/library/xml.html.

Why is it modified?

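ElementTree moves namespace declarations to the root element, and it drops declarations for namespaces that are not used in any element or attribute name. In this file the xs prefix appears only inside the value of the xsi:type attribute, which ElementTree treats as plain text, so the xmlns:xs declaration is removed.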
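The effect can be reproduced in memory with a trimmed copy of the file from the question. This is only a sketch: SOURCE is a shortened, illustrative version of the original document, and the xsi prefix is kept here because ElementTree already knows that namespace (as far as I know, other prefixes would come out as ns0, ns1, ... unless registered with ET.register_namespace()).

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the file from the question.
SOURCE = """<movie>
    <title>Der Eisbär</title>
    <value xsi:type="xs:int" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>
</movie>"""

root = ET.fromstring(SOURCE)

# Serialize back to text without touching the tree.
output = ET.tostring(root, encoding="unicode")
print(output)

# The xsi declaration has moved to <movie>; the xs declaration is gone,
# although the attribute value still says "xs:int".
```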

Why does ElementTree do this? I don't know, but perhaps it is a way to make the implementation simpler.

How can I prevent this? E.g. I just want to replace a specific tag or its value in a quite complex XML file without losing any other information.

I don't think there is a way to prevent this. The issue has been brought up before. Here are two very similar questions with no answers:

How do I parse and write XML using Python's ElementTree without moving namespaces around?
Keep Existing Namespaces when overwriting XML file with ElementTree and Python

My suggestion is to use lxml instead of ElementTree. With lxml, the namespace declarations will remain where they occur in the original file.
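A minimal sketch of that approach, assuming lxml is installed (pip install lxml). It works on a trimmed in-memory copy of the file rather than x.nfo, and the new title text is made up for illustration. The result is not guaranteed to be byte-identical to the input: the serializer may, for example, write the xmlns attributes before xsi:type and use single quotes in the XML declaration.

```python
from lxml import etree

# Trimmed, illustrative copy of the file from the question (bytes, because
# lxml rejects str input that carries an encoding declaration).
SOURCE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
    <title>Der Eisbär</title>
    <value xsi:type="xs:int" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>
</movie>""".encode("utf-8")

root = etree.fromstring(SOURCE)

# Change one value; everything else is left alone.
root.find("title").text = "New title"  # illustrative replacement text

output = etree.tostring(
    root.getroottree(),
    encoding="UTF-8",
    xml_declaration=True,  # write the <?xml ...?> line
    standalone=True,       # include the standalone flag in it
)
print(output.decode("utf-8"))
```

For a file on disk, etree.parse('x.nfo') followed by tree.write('y.nfo', encoding='UTF-8', xml_declaration=True, standalone=True) should behave the same way.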

Line 1 is gone.

That line is the XML declaration. It is recommended but not mandatory to have one.

If you always want an XML declaration, use xml_declaration=True in the write() method call.
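A short sketch of that call, writing to an in-memory buffer instead of y.nfo so it can be run as is (SOURCE is a trimmed, illustrative copy of the file). Note that the declaration is regenerated rather than copied: ElementTree's write() has no option for the standalone flag, so standalone="yes" from the original is not reproduced.

```python
import io
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the file from the question.
SOURCE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
    <title>Der Eisbär</title>
</movie>""".encode("utf-8")

tree = ET.ElementTree(ET.fromstring(SOURCE))

buf = io.BytesIO()
# xml_declaration=True forces the <?xml ...?> line even for UTF-8 output.
tree.write(buf, encoding="utf-8", xml_declaration=True)

output = buf.getvalue()
print(output.decode("utf-8"))
```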

answered Sep 28 '22 04:09

mzjn
