Remove namespace from XML with comment - Python

Tags: python, xml, lxml

This question is a follow up to this answer: https://stackoverflow.com/a/51972010/3480297

I'm trying to remove the namespace from an XML file. The linked answer works fine when the XML contains no comments, but as soon as there is a comment, an error is thrown.

This is an example of my code:

from lxml import etree

input_xml = '''
<package xmlns="http://apple.com/itunes/importer">
  <provider>some data <!-- example comment--> </provider>
  <language>en-GB</language>
</package>
'''
root = etree.fromstring(input_xml)

# Remove namespace prefixes
for elem in root.iter():
    elem.tag = etree.QName(elem).localname
# Remove unused namespace declarations
etree.cleanup_namespaces(root)

print(etree.tostring(root).decode())

This throws the following error:

ValueError: Invalid input tag of type <class 'cython_function_or_method'>

EDIT:

With the following "input_xml" structure, the code in the answer below does not remove all of the namespaces:

<package xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://com/scheme/location/example/ Location.xsd ">
  <provider>some data <!-- example comment--> </provider>
  <language>en-GB</language>
</package>

The result of the code is still:

<package xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://com/scheme/location/example/ Location.xsd ">
  <provider>some data <!-- example comment--> </provider>
  <language>en-GB</language>
</package>
Adam asked Jun 30 '26 06:06


1 Answer

Make sure that the node is not a comment before changing the tag. The code below also removes any attributes that are in a namespace.

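for elem in root.iter():
    # Comments and processing instructions have a callable .tag; skip them
    if not isinstance(elem.tag, str):
        continue

    # For elements, replace the qualified name with the local name
    elem.tag = etree.QName(elem).localname

    # Remove attributes that are in a namespace (copy the keys before deleting)
    for attr in list(elem.attrib):
        if attr.startswith("{"):
            del elem.attrib[attr]

# Remove namespace declarations that are no longer used
etree.cleanup_namespaces(root)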
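For reference, here is a self-contained sketch that puts the pieces together on input combining both examples from the question. The helper name strip_namespaces is hypothetical, not part of lxml. The sketch assumes that dropping namespaced attributes such as xsi:schemaLocation is acceptable for your use case, and it calls etree.cleanup_namespaces at the end so that the unused xmlns declarations go away as well.

```python
from lxml import etree


def strip_namespaces(root):
    """Hypothetical helper: strip namespaces from tags and drop namespaced attributes, in place."""
    for elem in root.iter():
        # Comments and processing instructions expose a callable as .tag,
        # so only real elements (string tags) are renamed.
        if not isinstance(elem.tag, str):
            continue
        elem.tag = etree.QName(elem).localname
        # Copy the keys first so the mapping is not mutated while iterating.
        for attr in list(elem.attrib):
            if attr.startswith("{"):
                del elem.attrib[attr]
    # Drop namespace declarations that are no longer referenced.
    etree.cleanup_namespaces(root)
    return root


input_xml = '''
<package xmlns="http://apple.com/itunes/importer"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://com/scheme/location/example/ Location.xsd">
  <provider>some data <!-- example comment--> </provider>
  <language>en-GB</language>
</package>
'''
root = strip_namespaces(etree.fromstring(input_xml))
result = etree.tostring(root).decode()
print(result)
```

The comment inside provider is left untouched, since only nodes with a string tag are modified.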
mzjn answered Jul 01 '26 18:07