This question is a follow-up to this answer: https://stackoverflow.com/a/51972010/3480297
I'm trying to remove the namespace from an XML file. The linked answer works fine when there are no comments in the XML. However, if there is a comment, an error is thrown.
This is an example of my code:
from lxml import etree
input_xml = '''
<package xmlns="http://apple.com/itunes/importer">
<provider>some data <!-- example comment--> </provider>
<language>en-GB</language>
</package>
'''
root = etree.fromstring(input_xml)
# Remove namespace prefixes
for elem in root.getiterator():
    elem.tag = etree.QName(elem).localname
# Remove unused namespace declarations
etree.cleanup_namespaces(root)
print(etree.tostring(root).decode())
This throws the following error:
ValueError: Invalid input tag of type <class 'cython_function_or_method'>
EDIT:
With the following "input_xml" structure, the code in the answer below does not remove all of the namespaces:
<package xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://com/scheme/location/example/ Location.xsd ">
<provider>some data <!-- example comment--> </provider>
<language>en-GB</language>
</package>
The result of the code is still:
<package xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://com/scheme/location/example/ Location.xsd ">
<provider>some data <!-- example comment--> </provider>
<language>en-GB</language>
</package>
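Make sure that the node is not a comment before changing the tag. A comment node's tag is the etree.Comment factory function rather than a string, which is why etree.QName raises the ValueError. The code below also removes any attributes that are in a namespace.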
for elem in root.iter():
    # For elements, replace qualified name with localname
    if not isinstance(elem, etree._Comment):
        elem.tag = etree.QName(elem).localname

    # Remove attributes that are in a namespace
    for attr in list(elem.attrib):
        if "{" in attr:
            elem.attrib.pop(attr)

# Remove unused namespace declarations
etree.cleanup_namespaces(root)
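For reference, here is a minimal self-contained sketch of the same approach that can be run directly. The function name strip_namespaces is hypothetical, and the input combines the default namespace from the first example with the xsi attribute from the edit. Instead of checking for etree._Comment, it checks whether elem.tag is a string. That is my own variation, on the assumption that you also want to skip processing instructions, whose tag is likewise a function.

```python
from lxml import etree


def strip_namespaces(root):
    """Hypothetical helper: strip namespaces from tags and attributes in place."""
    for elem in root.iter():
        # Comments and processing instructions have a function as .tag,
        # not a string, so leave them alone.
        if not isinstance(elem.tag, str):
            continue
        # Replace the qualified name with the local name.
        elem.tag = etree.QName(elem).localname
        # Copy the attribute names first so nothing is removed mid-iteration.
        for attr in list(elem.attrib):
            # Namespaced attribute names look like "{uri}local".
            if attr.startswith("{"):
                del elem.attrib[attr]
    # Drop namespace declarations that are no longer used.
    etree.cleanup_namespaces(root)
    return root


input_xml = '''
<package xmlns="http://apple.com/itunes/importer"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://com/scheme/location/example/ Location.xsd">
  <provider>some data <!-- example comment--> </provider>
  <language>en-GB</language>
</package>
'''

root = strip_namespaces(etree.fromstring(input_xml))
result = etree.tostring(root).decode()
print(result)
```

The printed document should keep the comment and the element text but no longer carry any xmlns declarations or the xsi:schemaLocation attribute. Note that this drops namespaced attributes entirely rather than renaming them, same as the loop above.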