I use the xml library in Python 3.5 to read and write an XML file. I don't modify the file; I just open it and write it back out. But the library modifies the file.
This is the example file
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
<title>Der Eisbär</title>
<ids>
<entry>
<key>tmdb</key>
<value xsi:type="xs:int" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>
</entry>
<entry>
<key>imdb</key>
<value xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">tt0167132</value>
</entry>
</ids>
</movie>
This is the code
import xml.etree.ElementTree as ET
tree = ET.parse('x.nfo')
tree.write('y.nfo', encoding='utf-8')
And the xml-file becomes this
<movie xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<title>Der Eisbär</title>
<ids>
<entry>
<key>tmdb</key>
<value xsi:type="xs:int">9321</value>
</entry>
<entry>
<key>imdb</key>
<value xsi:type="xs:string">tt0167132</value>
</entry>
</ids>
</movie>
The <movie> tag in line 2 now has attributes.
The <value> tags in lines 7 and 11 now have fewer attributes.
Note that "xml package" and "the xml library" are ambiguous. There are several XML-related modules in the standard library: https://docs.python.org/3/library/xml.html.
Why is it modified?
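ElementTree moves namespace declarations to the root element, and namespaces that aren't actually used in the document are removed. Here the xs prefix appears only inside attribute values (xsi:type="xs:int"), which ElementTree does not count as a use of the namespace, so its declaration is dropped.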
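To make this concrete, here is a minimal sketch that reproduces the behaviour with a shortened version of the example file. The SAMPLE string and the variable names are only for illustration. It parses the document, inspects how the xsi:type attribute is stored, and serializes the tree again without touching it.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative version of the file from the question.
SAMPLE = (
    '<movie>'
    '<value xsi:type="xs:int"'
    ' xmlns:xs="http://www.w3.org/2001/XMLSchema"'
    ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>'
    '</movie>'
)

root = ET.fromstring(SAMPLE)
value = root.find("value")

# The attribute name is stored as {namespace-uri}localname; the attribute
# value "xs:int" is kept as plain text, so the xs prefix is never resolved.
print(value.attrib)

# Serialize the unmodified tree. The xsi declaration is written on the root
# element, and no declaration is emitted for xs.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

The parsed tree does not record where a namespace was declared, only the expanded attribute name, which is why the original placement cannot be restored on output.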
Why does ElementTree do this? I don't know, but perhaps it is a way to make the implementation simpler.
How can I prevent this? E.g., I just want to replace a specific tag or its value in a quite complex XML file without losing any other information.
I don't think there is a way to prevent this. The issue has been brought up before. Here are two very similar questions with no answers:
My suggestion is to use lxml instead of ElementTree. With lxml, the namespace declarations will remain where they occur in the original file.
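A minimal sketch of the lxml route, assuming lxml is installed (it is a third-party package, not part of the standard library). The SAMPLE bytes and the helper name roundtrip_with_lxml are made up for this illustration. The namespace declarations should stay on the <value> element, although the order of declarations and attributes inside the tag may differ from the input.

```python
# Shortened, illustrative version of the file from the question.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    '<movie>\n'
    '<value xsi:type="xs:int"'
    ' xmlns:xs="http://www.w3.org/2001/XMLSchema"'
    ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>\n'
    '</movie>\n'
).encode("utf-8")


def roundtrip_with_lxml(data: bytes) -> bytes:
    """Parse and re-serialize XML bytes with lxml without modifying the tree."""
    # lxml is a third-party package (pip install lxml); it is imported here
    # so the sketch can be loaded even where lxml is not available.
    from lxml import etree

    root = etree.fromstring(data)
    return etree.tostring(
        root,
        encoding="UTF-8",
        xml_declaration=True,  # write the <?xml ...?> line
        standalone=True,       # keep standalone="yes" in the declaration
    )


if __name__ == "__main__":
    print(roundtrip_with_lxml(SAMPLE).decode("utf-8"))
```

For files, the equivalent would presumably be etree.parse('x.nfo') followed by tree.write('y.nfo', encoding='UTF-8', xml_declaration=True, standalone=True).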
Line 1 is gone.
That line is the XML declaration. It is recommended but not mandatory to have one.
If you always want an XML declaration, use xml_declaration=True in the write() method call.
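A minimal sketch of that call, writing to an in-memory buffer instead of y.nfo so it runs on its own. The small document here is only an illustration. As far as I know, ElementTree's write() has no option for standalone="yes", so that part of the original declaration is not reproduced.

```python
import io
import xml.etree.ElementTree as ET

# Small illustrative document; in the question this would be ET.parse('x.nfo').
tree = ET.ElementTree(ET.fromstring("<movie><title>Der Eisbär</title></movie>"))

# Write to a bytes buffer; pass a filename such as 'y.nfo' to write a file.
buffer = io.BytesIO()
tree.write(buffer, encoding="utf-8", xml_declaration=True)

output = buffer.getvalue()
print(output.decode("utf-8"))
```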