Assume that I've the following XML which I want to modify using Python's ElementTree
:
<root xmlns:prefix="URI">
<child company:name="***"/>
...
</root>
I'm doing some modification on the XML file like this:
import xml.etree.ElementTree as ET
tree = ET.parse('filename.xml')
# XML modification here
# save the modifications
tree.write('filename.xml')
Then the XML file looks like:
<root xmlns:ns0="URI">
<child ns0:name="***"/>
...
</root>
As you can see, the namepsace prefix
changed to ns0
. I'm aware of using ET.register_namespace()
as mentioned here.
The problem with ET.register_namespace()
is that:
prefix
and URI
e.g. If the xml looks like:
<root xmlns="http://uri">
<child name="name">
...
</child>
</root>
It will be transfomed to something like:
<ns0:root xmlns:ns0="http://uri">
<ns0:child name="name">
...
</ns0:child>
</ns0:root>
As you can see, the default namespace is changed to ns0
.
Is there any way to solve this problem with ElementTree
?
Python XML Parsing Modules Python allows parsing these XML documents using two modules namely, the xml. etree. ElementTree module and Minidom (Minimal DOM Implementation).
An XML namespace is a collection of names that can be used as element or attribute names in an XML document. The namespace qualifies element names uniquely on the Web in order to avoid conflicts between elements with the same name.
Parsing from strings and files. lxml. etree supports parsing XML in a number of ways and from all important sources, namely strings, files, URLs (http/ftp) and file-like objects. The main parse functions are fromstring() and parse(), both called with the source as first argument.
ElementTree will replace those namespaces' prefixes that are not registered with ET.register_namespace
. To preserve a namespace prefix, you need to register it first before writing your modifications on a file. The following method does the job and registers all namespaces globally,
def register_all_namespaces(filename):
namespaces = dict([node for _, node in ET.iterparse(filename, events=['start-ns'])])
for ns in namespaces:
ET.register_namespace(ns, namespaces[ns])
This method should be called before ET.parse
method, so that the namespaces will remain as unchanged,
import xml.etree.ElementTree as ET
register_all_namespaces('filename.xml')
tree = ET.parse('filename.xml')
# XML modification here
# save the modifications
tree.write('filename.xml')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With