XSLT output formatting: removing line breaks, and blank output lines from removed elements while keeping indent

Tags: xml, xslt

Here is my XML:

<doc xmlns="http://www.foo.org">
  <div>
    <title>Mr. Title</title>
    <paragraph>This is one paragraph.
    </paragraph>
    <paragraph>Another paragraph.
    </paragraph>
    <list>
      <orderedlist>
        <item>
          <paragraph>An item paragraph.</paragraph>
        </item>
        <item>
          <paragraph>Another item paragraph</paragraph>
        </item>
      </orderedlist>
    </list>
  </div>    
</doc>

Here is my XSL:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:foo="http://www.foo.org">

<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="foo:doc">
  <xsl:element name="newdoc" namespace="http://www/w3.org/1999/xhtml">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:div">
  <segment title="{foo:title}">
   <xsl:apply-templates/>
  </segment>
 </xsl:template>

 <xsl:template match="foo:title">
  <xsl:element name="h2">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:paragraph">
  <xsl:element name="p">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:list">
  <xsl:apply-templates/>
 </xsl:template>

 <xsl:template match="foo:orderedlist">
  <xsl:element name="ol">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:item">
  <xsl:element name="li">
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="foo:item/foo:paragraph">
  <xsl:apply-templates/>
 </xsl:template>

</xsl:stylesheet>

And the output:

<newdoc xmlns="http://www/w3.org/1999/xhtml">
  <segment xmlns="" title="Mr. Title">
    <h2>Mr. Title</h2>
    <p>This is one paragraph.
    </p>
    <p>Another paragraph.
    </p>

      <ol>
        <li>
          An item paragraph.
        </li>

        <li>
          Another item paragraph
        </li>
      </ol>

  </segment>    
</newdoc>

I would like to change three things about this output:

  1. remove the line break inside the "p" elements (originally paragraph)
  2. remove the line breaks inside the "li" elements (left behind when the item/paragraph elements were removed)
  3. remove the extra blank lines left behind when the list elements were removed

Here is what I have tried so far:

- <xsl:template match="foo:list/text()[normalize-space(.)='']" /> for #3, but this messes up the indentation

- <xsl:template match="foo:paragraph/text()[normalize-space(.)='']" /> for #1, but this has no effect on the line breaks

- <xsl:strip-space elements="*"/>, but this eliminates all indentation

Thank you!!

asked Apr 20 '11 23:04 by Zori



2 Answers

Adding these templates to your stylesheet:

<xsl:template match="*/text()[normalize-space()]">
    <xsl:value-of select="normalize-space()"/>
</xsl:template>

<xsl:template match="*/text()[not(normalize-space())]" />

Produces this output:

<?xml version="1.0" encoding="UTF-8"?>
<newdoc xmlns="http://www/w3.org/1999/xhtml">
    <segment xmlns="" xmlns:foo="http://www.foo.org" title="Mr. Title">
        <h2>Mr. Title</h2>
        <p>This is one paragraph.</p>
        <p>Another paragraph.</p>
        <ol>
            <li>An item paragraph.</li>
            <li>Another item paragraph</li>
        </ol>
    </segment>
</newdoc>
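
A possible condensed variant of the two templates above (a sketch, not taken from the original answer): because normalize-space() returns an empty string for a whitespace-only text node, a single template can cover both cases. The priority="1" attribute is an assumption added here so that the template wins over the identity template wherever it is placed. This also shows why matching only whitespace-only text nodes under foo:paragraph had no effect: the line break there belongs to a text node that also contains words.

```xml
<!-- Hypothetical single-template variant of the two templates above.
     priority="1" is an assumption: it lets this template beat the
     identity template (match="node()|@*") regardless of position. -->
<xsl:template match="text()" priority="1">
    <!-- normalize-space() yields '' for whitespace-only text nodes,
         so nothing is written for them; other text is trimmed and
         its internal whitespace runs are collapsed to single spaces. -->
    <xsl:value-of select="normalize-space()"/>
</xsl:template>
```

Note that normalize-space() also trims leading and trailing spaces, so in mixed content (text next to inline child elements) words can end up glued to the neighbouring element.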

answered Oct 05 '22 01:10 by Mads Hansen


At the very end of the stylesheet add these two templates:

<xsl:template match=
"text()[not(string-length(normalize-space()))]"/>

<xsl:template match=
"text()[string-length(normalize-space()) > 0]">
  <xsl:value-of select="translate(.,'&#xA;&#xD;', '  ')"/>
</xsl:template>

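This removes the line breaks and the blank lines. Note that the indentation that followed each line break stays in the "p" elements as trailing spaces: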

<?xml version="1.0" encoding="UTF-8"?>
<newdoc xmlns="http://www/w3.org/1999/xhtml">
   <segment xmlns="" xmlns:foo="http://www.foo.org" title="Mr. Title">
      <h2>Mr. Title</h2>
      <p>This is one paragraph.         </p>
      <p>Another paragraph.         </p>
      <ol>
         <li>An item paragraph.</li>
         <li>Another item paragraph</li>
      </ol>
   </segment>
</newdoc>
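
To see where the trailing spaces in the "p" elements come from, the two string functions can be compared side by side. The following standalone stylesheet is only an illustrative sketch: the compare, translated and normalized element names are made up for this example, and it assumes the XML from the question as input. The square brackets make leading and trailing whitespace visible.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://www.foo.org"
    exclude-result-prefixes="foo">

  <xsl:output method="xml" indent="yes"/>

  <!-- Apply both functions to the string value of the first foo:paragraph.
       The element names below are hypothetical and exist only for this demo. -->
  <xsl:template match="/">
    <compare>
      <!-- translate(): each LF/CR becomes a space; the spaces that
           followed the line break are kept. -->
      <translated>[<xsl:value-of
          select="translate((//foo:paragraph)[1], '&#xA;&#xD;', '  ')"/>]</translated>
      <!-- normalize-space(): leading/trailing whitespace is trimmed and
           internal runs are collapsed to a single space. -->
      <normalized>[<xsl:value-of
          select="normalize-space((//foo:paragraph)[1])"/>]</normalized>
    </compare>
  </xsl:template>

</xsl:stylesheet>
```

translate() only swaps each line break for a space and keeps the indentation that followed it, while normalize-space() trims that whitespace away.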

answered Oct 05 '22 03:10 by Dimitre Novatchev