
Unwanted namespace declaration in lxml XPath

Tags: python, xpath, lxml

I want to select the first child of a particular element (subelement), but this child's namespace differs from its parent's namespace. Moreover, the child can be in any namespace.

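from lxml import etree

xml = '''<root xmlns="default_ns">
    <subelement>
        <!-- here we can have an element of any namespace  -->
        <some_prefix:a xmlns:some_prefix="some_namespace">
            <some_prefix:b/>
        </some_prefix:a>
    </subelement>
</root>'''
root = etree.fromstring(xml)
# Bind the prefix 'def' to the document's default namespace for use in XPath
evaluator = etree.XPathEvaluator(root, namespaces={'def': 'default_ns'})
# Take the first child element of subelement, whatever its namespace
child = evaluator('//def:subelement/child::*')[0]
a_string = etree.tostring(child)
# tostring() returns bytes on Python 3, so decode before printing
print(a_string.decode())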

This gives:

<some_prefix:a xmlns:some_prefix="some_namespace" xmlns="default_ns">
    <some_prefix:b/>
</some_prefix:a>

but what I want is the child without the xmlns="default_ns" declaration inherited from the parent:

<some_prefix:a xmlns:some_prefix="some_namespace">
    <some_prefix:b/>
</some_prefix:a>
Marcin asked Feb 28 '12 21:02



1 Answer

but what I want to get is child without namespace declaration from parent xmlns="default_ns".

This is not possible to achieve by only evaluating an XPath expression.

In XML, every element inherits all of its parent's namespace nodes unless it redeclares a particular namespace.

This means that some_prefix:a inherits the default namespace "default_ns" from its parent (subelement), which itself inherits this same default namespace node from the top element root.

XPath is a query language for XML documents. As such, it only helps select nodes, but the evaluation of an XPath expression never destroys, adds or alters nodes, including namespace nodes.

Because of this, the default namespace node that belongs to some_prefix:a cannot be destroyed as a result of evaluating your XPath expression -- thus this namespace node is shown when some_prefix:a is serialized to text.
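The inheritance described above can be observed from Python. The following is a minimal sketch, assuming lxml is installed; it relies on the element's nsmap property, which lxml documents as including the namespace declarations of the parents. The names XML and in_scope are only illustrative.

```python
from lxml import etree

XML = b'''<root xmlns="default_ns">
    <subelement>
        <some_prefix:a xmlns:some_prefix="some_namespace">
            <some_prefix:b/>
        </some_prefix:a>
    </subelement>
</root>'''

root = etree.fromstring(XML)
# Select the first child element of subelement, whatever its namespace
child = root.xpath('//def:subelement/*[1]', namespaces={'def': 'default_ns'})[0]

# nsmap lists every prefix in scope for this element, including inherited ones;
# the default namespace appears under the key None
in_scope = child.nsmap
print(in_scope)
```

The default namespace shows up in the mapping even though some_prefix:a never declares it itself, which is why the serializer writes it out.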

Solution: Use the programming language that hosts XPath to delete the unwanted namespace node.

For example, if the hosting language is XSLT:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:d="default_ns">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
  <xsl:apply-templates mode="delNS"
    select="/*/d:subelement/*[1]"/>
 </xsl:template>

 <xsl:template match="*" mode="delNS">
   <xsl:element name="{name()}" namespace="{namespace-uri()}">
    <xsl:copy-of select="namespace::*[name()]"/>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates mode="delNS" select="node()"/>
   </xsl:element>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<root xmlns="default_ns">
    <subelement>
        <!-- here we can have an element of any namespace  -->
        <some_prefix:a xmlns:some_prefix="some_namespace">
            <some_prefix:b/>
        </some_prefix:a>
    </subelement>
</root>

the wanted, correct result is produced:

<some_prefix:a xmlns:some_prefix="some_namespace">
   <some_prefix:b/>
</some_prefix:a>
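Since the question uses Python as the hosting language, the same idea can be sketched with lxml instead of XSLT. This is one possible approach, not necessarily the only one: copy the selected element so that it becomes the root of its own tree, then let etree.cleanup_namespaces drop any declaration that nothing in the copy uses. The helper name serialize_without_inherited_ns is hypothetical and introduced here only for illustration.

```python
from copy import deepcopy
from lxml import etree

XML = b'''<root xmlns="default_ns">
    <subelement>
        <some_prefix:a xmlns:some_prefix="some_namespace">
            <some_prefix:b/>
        </some_prefix:a>
    </subelement>
</root>'''


def serialize_without_inherited_ns(element):
    """Hypothetical helper: serialize a detached copy of `element`
    without namespace declarations that the copy does not use."""
    # Work on a copy so the original document is left untouched
    detached = deepcopy(element)
    # Remove declarations not used by any element or attribute in the copy
    etree.cleanup_namespaces(detached)
    # with_tail=False leaves out the whitespace that follows the element
    return etree.tostring(detached, with_tail=False)


root = etree.fromstring(XML)
child = root.xpath('//def:subelement/*[1]', namespaces={'def': 'default_ns'})[0]
result = serialize_without_inherited_ns(child)
print(result.decode())
```

Copying first matters because the inherited declaration lives on an ancestor; cleaning up only the original subelement would not change what is in scope for it. Note that if the selected child actually uses the default namespace, the declaration is needed and will be kept.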
Dimitre Novatchev answered Oct 02 '22 18:10