Processing instructions transform

Tags:

xslt

I want to transform processing instructions in a source XML document into an element in the output.

Input

<?xml version="1.0" encoding="utf-8"?>
<root>
    <?PI_start?> SOME TEXT <?PI_end?>
</root>

I want the output XML to look like this:

<root>
    <tag> SOME TEXT </tag>
</root>

Is this possible? If so, what XSLT should I use for the transformation?

So far I have only found a way to transform a single PI into an element with opening and closing tags, using the PI's own content as the element's text.

Input XML

<root>
    <?PI SOME TEXT?>
</root>

XSL

<xsl:template match="processing-instruction('PI')">
    <tag><xsl:value-of select="."/></tag>
</xsl:template>

Output

<tag>SOME TEXT</tag>

But this is not quite my case.

Nawa asked Nov 12 '10 13:11



1 Answer

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="processing-instruction('PI_start')">
  <tag>
   <xsl:apply-templates mode="copy" select=
       "following-sibling::node()[1][self::text()]"/>
  </tag>
 </xsl:template>

 <xsl:template match=
 "processing-instruction('PI_end')
 |
  text()[preceding-sibling::node()[1]
              [self::processing-instruction('PI_start')]]
 "/>
</xsl:stylesheet>

when applied on the provided XML document:

<?xml version="1.0" encoding="utf-8"?>
<root>
    <?PI_start?> SOME TEXT <?PI_end?>
</root>

produces the wanted, correct result:

<root>
   <tag> SOME TEXT </tag>
</root>

Do note:

  1. The identity rule is used to copy all nodes "as-is".

  2. We have additional templates only for nodes that should be changed in some way.

  3. The template matching the first PI does almost all the work: it creates a tag element and applies templates to the immediately following sibling node, provided that node is a text node.

  4. We apply templates in mode "copy" to the text node that immediately follows the first PI.

  5. No template is defined in the mode "copy", so the built-in template rule for text nodes is selected; its action is simply to copy the text to the output. This trick saves us from having to define a template in the "copy" mode.

  6. We have an empty template that actually deletes the unwanted nodes: the second PI and what would be a second copy of the first PI's immediate-sibling text node.
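
Note 5 relies on the built-in rule for text nodes. For anyone who would rather not depend on that, here is a minimal sketch of the same stylesheet with the "copy" mode template written out explicitly. The added text() template is an assumption about what an explicit equivalent would look like and is not part of the original solution; it should produce the same result on the sample document.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Identity rule: copy every node unchanged by default -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Wrap the text node right after PI_start in <tag> -->
 <xsl:template match="processing-instruction('PI_start')">
  <tag>
   <xsl:apply-templates mode="copy"
        select="following-sibling::node()[1][self::text()]"/>
  </tag>
 </xsl:template>

 <!-- Added for illustration: explicit equivalent of the built-in
      text rule, declared only for mode "copy" -->
 <xsl:template match="text()" mode="copy">
  <xsl:value-of select="."/>
 </xsl:template>

 <!-- Default mode only: delete PI_end and the already-wrapped text -->
 <xsl:template match=
  "processing-instruction('PI_end')
  |
   text()[preceding-sibling::node()[1]
               [self::processing-instruction('PI_start')]]"/>
</xsl:stylesheet>
```

Because the empty template lives in the default mode and the explicit one in mode "copy", the same text node can be emitted inside tag and suppressed at its original position.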

Update: The OP has indicated that he is also interested in the case where nodes of other kinds (not only text nodes) may appear between the two PIs.

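This is a considerably more complex task. Here is one solution, which uses a key to index every node by its nearest preceding PI_start sibling and its nearest following PI_end sibling: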

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kSurrounded" match="node()"
  use="concat(
        generate-id(preceding-sibling::processing-instruction('PI_start')[1]),
        '+++',
        generate-id(following-sibling::processing-instruction('PI_end')[1])
             )"/>

 <xsl:template match="node()|@*" name="identity">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="processing-instruction('PI_start')">
  <tag>
   <xsl:apply-templates mode="copy" select=
       "key('kSurrounded',
             concat(generate-id(),
                   '+++',
                   generate-id(following-sibling::processing-instruction('PI_end')[1])
                   )
             )"/>
  </tag>
 </xsl:template>

 <xsl:template match=
 "processing-instruction('PI_end')
 |
  node()[(preceding-sibling::processing-instruction('PI_start')
         |
          preceding-sibling::processing-instruction('PI_end')
          )
           [last()][self::processing-instruction('PI_start')]
        and
         (following-sibling::processing-instruction('PI_start')
        |
          following-sibling::processing-instruction('PI_end')
          )
           [1][self::processing-instruction('PI_end')]
        ]
 "/>

 <xsl:template match="node()" mode="copy">
  <xsl:call-template name="identity"/>
 </xsl:template>
</xsl:stylesheet>

when the above transformation is applied on the following XML document:

<root>
    <?PI_start?> <strong>Some</strong> TEXT <?PI_end?> XA <?PI_end?>
</root>

the wanted, correct output is produced:

<root>
    <tag>
        <strong>Some</strong> TEXT 
    </tag> XA 
</root>
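
If an XSLT 2.0 processor is available (an assumption; the solutions above target XSLT 1.0), the same grouping can probably be expressed more directly with xsl:for-each-group. The following is only a sketch of that alternative technique, not the method used above, and it has not been verified against every edge case. One known difference: a PI_start with no matching PI_end is simply dropped here.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Identity rule -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Any element that has a PI_start child: regroup its children -->
 <xsl:template match="*[processing-instruction('PI_start')]">
  <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <!-- Each group begins at a PI_start (the first group may not) -->
   <xsl:for-each-group select="node()"
        group-starting-with="processing-instruction('PI_start')">
    <!-- First PI_end inside the current group, if any -->
    <xsl:variable name="end" select=
     "current-group()[self::processing-instruction('PI_end')][1]"/>
    <xsl:choose>
     <xsl:when test="self::processing-instruction('PI_start') and $end">
      <tag>
       <!-- Nodes after the PI_start and before the first PI_end -->
       <xsl:apply-templates select=
        "current-group()[position() gt 1][. &lt;&lt; $end]"/>
      </tag>
      <!-- Nodes after that PI_end are processed normally -->
      <xsl:apply-templates select="current-group()[. &gt;&gt; $end]"/>
     </xsl:when>
     <xsl:otherwise>
      <xsl:apply-templates select="current-group()"/>
     </xsl:otherwise>
    </xsl:choose>
   </xsl:for-each-group>
  </xsl:copy>
 </xsl:template>

 <!-- Marker PIs themselves never reach the output -->
 <xsl:template match="processing-instruction('PI_start')
                    | processing-instruction('PI_end')"/>
</xsl:stylesheet>
```

On the sample document above, this should wrap the strong element and " TEXT " in tag and leave " XA " as a child of root, as in the XSLT 1.0 result; the exact indentation depends on the processor.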
Dimitre Novatchev answered Oct 26 '22 04:10
