Suppose I have an XML string:
<A>
<B foo="123">
<C>thing</C>
<D>stuff</D>
</B>
</A>
and I want to insert a namespace of the type used by XML Schema, putting a prefix in front of all the element names.
<A xmlns:ns1="www.example.com">
<ns1:B foo="123">
<ns1:C>thing</ns1:C>
<ns1:D>stuff</ns1:D>
</ns1:B>
</A>
Is there a way to do this (aside from brute-force find-replace or regex) using lxml.etree or a similar library?
I don't think this can be done with just ElementTree.
Manipulating namespaces is sometimes surprisingly tricky. There are many questions about it here on SO. Even with the more advanced lxml library, it can be really hard.
Below is a solution that uses XSLT.
Code:
from lxml import etree
XML = '''
<A>
<B foo="123">
<C>thing</C>
<D>stuff</D>
</B>
</A>'''
XSLT = '''
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ns1="www.example.com">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:template match="*">
<xsl:element name="ns1:{name()}">
<xsl:apply-templates select="node()|@*"/>
</xsl:element>
</xsl:template>
<!-- No prefix on the A element -->
<xsl:template match="A">
<A xmlns:ns1="www.example.com">
<xsl:apply-templates select="node()|@*"/>
</A>
</xsl:template>
</xsl:stylesheet>'''
xml_doc = etree.fromstring(XML)
xslt_doc = etree.fromstring(XSLT)
transform = etree.XSLT(xslt_doc)
print(transform(xml_doc))
Output:
<A xmlns:ns1="www.example.com">
<ns1:B foo="123">
<ns1:C>thing</ns1:C>
<ns1:D>stuff</ns1:D>
</ns1:B>
</A>
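As a quick sanity check on the output above, here is a minimal sketch that parses the result with the standard library and lists each element's tag in the {uri}name form that ElementTree uses internally. The OUTPUT string is just the text shown above pasted in; the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# The transformed document from the answer above, pasted in as a string.
OUTPUT = '''<A xmlns:ns1="www.example.com">
<ns1:B foo="123">
<ns1:C>thing</ns1:C>
<ns1:D>stuff</ns1:D>
</ns1:B>
</A>'''

root = ET.fromstring(OUTPUT)

# ElementTree expands each prefix into its URI, written as {uri}localname.
tags = [elem.tag for elem in root.iter()]
print(tags)

# Searching with a prefix needs an explicit prefix-to-URI mapping.
b = root.find('ns1:B', {'ns1': 'www.example.com'})
print(b.attrib)
```

The A element stays un-namespaced, the other three elements are in the www.example.com namespace, and the foo attribute is carried over unchanged.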
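Use ET.register_namespace(prefix, 'www.example.com') to register the namespace with ElementTree, so that write() and tostring() serialize with your chosen prefix instead of an auto-generated one. Note that ElementTree reserves prefixes of the form ns followed by digits (such as ns1) for internal use, and register_namespace raises ValueError for them, so pick a different prefix, e.g. 'ex'. (I have code that uses a prefix of '' (an empty string) for the default namespace.)
Then rename each element so its tag carries the namespace in braces, e.g. elem.tag = '{www.example.com}' + elem.tag, and use the same form when searching: root.find('{www.example.com}B').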
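Putting the two steps together might look like the sketch below, using only the standard library. The helper name add_namespace and the prefix 'ex' are my own choices for illustration; 'ex' is used instead of 'ns1' because register_namespace rejects prefixes of the form ns followed by digits.

```python
import xml.etree.ElementTree as ET

XML = '''<A>
<B foo="123">
<C>thing</C>
<D>stuff</D>
</B>
</A>'''


def add_namespace(xml_text, prefix, uri):
    """Put every element except the root into the given namespace."""
    # Controls which prefix the serializer writes for this URI.
    # Note: this registration is global to the ElementTree module.
    ET.register_namespace(prefix, uri)
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem is not root:  # leave the A element without a prefix
            elem.tag = '{%s}%s' % (uri, elem.tag)
    return ET.tostring(root, encoding='unicode')


result = add_namespace(XML, 'ex', 'www.example.com')
print(result)
```

ElementTree writes the xmlns declaration on the root element when serializing, so the result has the same shape as the desired output in the question, only with ex: in place of ns1:. Attributes are left alone, since only elem.tag is changed.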