How can I wrap a group of adjacent elements using XSLT?

Tags: xml, xslt
I have some XML containing <ListItem> elements, and I'd like to wrap each run of consecutive <ListItem> elements in a <List> element. The source XML looks something like this:

<Section>
  <Head>Heading</Head>
  <Para>Blah</Para>
  <ListItem>item 1</ListItem>
  <ListItem>item 2</ListItem>
  <ListItem>item 3</ListItem>
  <ListItem>item 4</ListItem>
  <Para>Something else</Para>
</Section>

And I'd want to convert it to something like this:

<Section>
  <Head>Heading</Head>
  <Para>Blah</Para>
  <List>
    <ListItem>item 1</ListItem>
    <ListItem>item 2</ListItem>
    <ListItem>item 3</ListItem>
    <ListItem>item 4</ListItem>
  </List>
  <Para>Something else</Para>
</Section>

using XSLT. I'm sure it's obvious but I can't work it out at this time in the evening. Thanks!


Edit: this can be safely ignored by most people.

This XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root>
  <Story>
    <Section id="preface">
      <ChapterTitle>Redacted</ChapterTitle>
      <HeadA>Redacted</HeadA>
      <Body>Redacted</Body>
      <BulletListItem>Item1</BulletListItem>
      <BulletListItem>Item2</BulletListItem>
      <BulletListItem>Item3</BulletListItem>
      <BulletListItem>Item4</BulletListItem>
      <HeadA>Redacted</HeadA>
      <Body>Redacted</Body>
      <HeadA>Redacted</HeadA>
      <Body>Redacted</Body>
      <Body>Redacted<Italic>REDACTED</Italic>Redacted</Body>
      <BulletListItem>Second list Item1</BulletListItem>
      <BulletListItem>Second list Item2</BulletListItem>
      <BulletListItem>Second list Item3</BulletListItem>
      <BulletListItem>Second list Item4</BulletListItem>
      <Body>Redacted</Body>
    </Section>
  </Story>
</Root>

With this XSL:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:key name="kFollowing" match="BulletListItem[preceding-sibling::*[1][self::BulletListItem]]"
  use="generate-id(preceding-sibling::BulletListItem
         [not(preceding-sibling::*[1][self::BulletListItem])])"/>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="BulletListItem
         [not(preceding-sibling::*[1][self::BulletListItem])]">
  <BulletList>
    <xsl:call-template name="identity"/>
    <xsl:apply-templates mode="copy" select="key('kFollowing', generate-id())"/>
  </BulletList>
 </xsl:template>

 <xsl:template match="BulletListItem[preceding-sibling::*[1][self::BulletListItem]]"/>

 <xsl:template match="BulletListItem" mode="copy">
  <xsl:call-template name="identity"/>
 </xsl:template>
</xsl:stylesheet>

When processed with Ruby's REXML and XML/XSLT libraries, it produces this XML (pretty-printed output):

<Root>
  <Story>
    <Section id='preface'>
      <ChapterTitle>
        Redacted
      </ChapterTitle>
      <HeadA>
        Redacted
      </HeadA>
      <Body>
        Redacted
      </Body>
      <BulletList>
        <BulletListItem>
          Item1
        </BulletListItem>
        <BulletListItem>
          Item2
        </BulletListItem>
        <BulletListItem>
          Item3
        </BulletListItem>
        <BulletListItem>
          Item4
        </BulletListItem>
        <BulletListItem>
          Second list Item2
        </BulletListItem>
        <BulletListItem>
          Second list Item3
        </BulletListItem>
        <BulletListItem>
          Second list Item4
        </BulletListItem>
      </BulletList>
      <HeadA>
        Redacted
      </HeadA>
      <Body>
        Redacted
      </Body>
      <HeadA>
        Redacted
      </HeadA>
      <Body>
        Redacted
      </Body>
      <Body>
        Redacted
        <Italic>
          REDACTED
        </Italic>
        Redacted
      </Body>
      <BulletList>
        <BulletListItem>
          Second list Item1
        </BulletListItem>
      </BulletList>
      <Body>
        Redacted
      </Body>
    </Section>
  </Story>
</Root>

You'll see that the two lists get jammed together and the bit in between gets lost. Not sure if this is a bug in the Ruby libraries or in your XSLT.

asked Oct 18 '10 18:10 by Skilldrick


1 Answer

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:key name="kFollowing" match="ListItem[preceding-sibling::*[1][self::ListItem]]"
  use="generate-id(preceding-sibling::ListItem
         [not(preceding-sibling::*[1][self::ListItem])][1])"/>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="ListItem
         [not(preceding-sibling::*[1][self::ListItem])]">
  <List>
    <xsl:call-template name="identity"/>
    <xsl:apply-templates mode="copy" select="key('kFollowing', generate-id())"/>
  </List>
 </xsl:template>

 <xsl:template match="ListItem[preceding-sibling::*[1][self::ListItem]]"/>

 <xsl:template match="ListItem" mode="copy">
  <xsl:call-template name="identity"/>
 </xsl:template>
</xsl:stylesheet>

when applied to the provided XML document:

<Section>
  <Head>Heading</Head>
  <Para>Blah</Para>
  <ListItem>item 1</ListItem>
  <ListItem>item 2</ListItem>
  <ListItem>item 3</ListItem>
  <ListItem>item 4</ListItem>
  <Para>Something else</Para>
</Section>

produces the wanted result:

<Section>
    <Head>Heading</Head>
    <Para>Blah</Para>
    <List>
        <ListItem>item 1</ListItem>
        <ListItem>item 2</ListItem>
        <ListItem>item 3</ListItem>
        <ListItem>item 4</ListItem>
    </List>
    <Para>Something else</Para>
</Section>
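
A note on the stylesheet in the question's edit, offered as a likely explanation rather than a confirmed diagnosis: its key lacks the trailing `[1]` that the key above has. In XSLT 1.0, `generate-id()` applied to a node-set uses the node that comes first in document order, so without `[1]` every non-initial item is indexed under the first run start among its preceding siblings, not the start of its own run. That matches the reported output, where the first list absorbs items 2 to 4 of the second list, and suggests the Ruby libraries are behaving correctly. A sketch of the same transformation adapted to the `BulletListItem`/`BulletList` names from the edit, with the `[1]` in place and comments added:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- Index every non-first item of a run by the id of the item that starts
      its own run. The trailing [1] picks the nearest preceding run start;
      without it, generate-id() uses the first run start in document order. -->
 <xsl:key name="kFollowing"
  match="BulletListItem[preceding-sibling::*[1][self::BulletListItem]]"
  use="generate-id(preceding-sibling::BulletListItem
         [not(preceding-sibling::*[1][self::BulletListItem])][1])"/>

 <!-- Identity rule: copy everything not handled below -->
 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- First item of a run: open the wrapper, copy itself, then the rest of its run -->
 <xsl:template match="BulletListItem
         [not(preceding-sibling::*[1][self::BulletListItem])]">
  <BulletList>
    <xsl:call-template name="identity"/>
    <xsl:apply-templates mode="copy" select="key('kFollowing', generate-id())"/>
  </BulletList>
 </xsl:template>

 <!-- Later items are emitted by the first item of their run, so skip them here -->
 <xsl:template match="BulletListItem[preceding-sibling::*[1][self::BulletListItem]]"/>

 <!-- Copy mode: used only when a run's first item pulls in its followers -->
 <xsl:template match="BulletListItem" mode="copy">
  <xsl:call-template name="identity"/>
 </xsl:template>
</xsl:stylesheet>
```

Applied to the second XML document in the question, this should yield two separate <BulletList> elements, each wrapping its own four items.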

answered Nov 05 '22 12:11 by Dimitre Novatchev
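
If an XSLT 2.0 processor (for example Saxon) is available, which is an assumption and not something the question states, `xsl:for-each-group` with `group-adjacent` expresses the same grouping without a key. A minimal sketch, assuming the parent of the `ListItem` elements has element-only content (the `select="*"` below ignores any significant text, comments, or processing instructions that are direct children of that parent):

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- Identity rule: copy everything not handled below -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Any element with at least one ListItem child: regroup its child elements.
      Adjacent children sharing the same key (ListItem or not) form one group. -->
 <xsl:template match="*[ListItem]">
  <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <xsl:for-each-group select="*" group-adjacent="boolean(self::ListItem)">
    <xsl:choose>
     <!-- Key is true: this group is a run of consecutive ListItem elements -->
     <xsl:when test="current-grouping-key()">
      <List>
       <xsl:apply-templates select="current-group()"/>
      </List>
     </xsl:when>
     <!-- Key is false: pass the other elements through unchanged -->
     <xsl:otherwise>
      <xsl:apply-templates select="current-group()"/>
     </xsl:otherwise>
    </xsl:choose>
   </xsl:for-each-group>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
```

Processors that support only XSLT 1.0 will not accept this stylesheet; the key-based approach in the answer remains the option there.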