
XPath: filter out children

I'm looking for an XPath expression that filters out certain children. A child should be kept only if it contains a CCC node whose value is B.

Source:

<AAA>
    <BBB1>
        <CCC>A</CCC>
    </BBB1>       
    <BBB2>
        <CCC>A</CCC>
    </BBB2>
    <BBB3>
        <CCC>B</CCC>
    </BBB3>
    <BBB4>
        <CCC>B</CCC>
    </BBB4>
</AAA>

This should be the result:


<AAA>
    <BBB3>
        <CCC>B</CCC>
    </BBB3>
    <BBB4>
        <CCC>B</CCC>
    </BBB4>
</AAA>

Hopefully someone can help me.

Jos

Jos asked Mar 03 '11 11:03


2 Answers

XPath is a query language for XML documents. As such it can only select nodes from existing XML document(s) -- it cannot modify an XML document or create a new XML document.

Use XSLT in order to transform an XML document and create a new XML document from it.

In this particular case:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/*/*[not(CCC = 'B')]"/>
</xsl:stylesheet>

when this transformation is applied on the provided XML document:

<AAA>
    <BBB1>
        <CCC>A</CCC>
    </BBB1>
    <BBB2>
        <CCC>A</CCC>
    </BBB2>
    <BBB3>
        <CCC>B</CCC>
    </BBB3>
    <BBB4>
        <CCC>B</CCC>
    </BBB4>
</AAA>

the wanted, correct result is produced:

<AAA>
   <BBB3>
      <CCC>B</CCC>
   </BBB3>
   <BBB4>
      <CCC>B</CCC>
   </BBB4>
</AAA>
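
If XSLT is not available, the same split (XPath selects, something else builds the new document) can be sketched in a host language. The following is only a minimal illustration, assuming Python and its standard `xml.etree.ElementTree` module, neither of which is mentioned in the question. The function name `keep_children_with_ccc_b` is made up for this example, and ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# The sample document from the question, compacted.
SOURCE = """<AAA>
    <BBB1><CCC>A</CCC></BBB1>
    <BBB2><CCC>A</CCC></BBB2>
    <BBB3><CCC>B</CCC></BBB3>
    <BBB4><CCC>B</CCC></BBB4>
</AAA>"""


def keep_children_with_ccc_b(xml_text):
    """Parse xml_text and keep only root children that have a CCC child equal to 'B'."""
    root = ET.fromstring(xml_text)
    # XPath only selects: these are the children we want to keep.
    wanted = root.findall("./*[CCC='B']")
    # The host language does the modification: remove every other child.
    for child in list(root):
        if child not in wanted:
            root.remove(child)
    return root


result = keep_children_with_ccc_b(SOURCE)
kept_tags = [child.tag for child in result]
print(kept_tags)  # ['BBB3', 'BBB4']
```

The selecting expression `./*[CCC='B']` is the positive counterpart of the `not(CCC = 'B')` predicate used in the empty template above.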

Dimitre Novatchev answered Sep 23 '22 02:09


To select all of the desired element and text nodes, use this XPath:

//node()[.//CCC[.='B']
      or self::CCC[.='B']
      or self::text()[parent::CCC[.='B']]]

This can be achieved more simply by using XPath in a modified identity transform in XSLT:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" />

    <!--Empty template for the content we want to redact -->
    <xsl:template match="*[CCC[not(.='B')]]" />

    <!--By default, copy all content forward -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

Mads Hansen answered Sep 26 '22 02:09