
Trying to transform XML with XSLT, but the output line breaks

I'm trying to transform XML with XSLT, but the output contains unwanted line breaks.

This is my code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="1.0">

 <xsl:template match="/" >
    <data>
     <content>
            <xsl:for-each select="//t[@pos='ADV' or @pos='ADJD' or @pos='ADJA' or @pos='NE' or 
                                      @pos='NN' or @pos='VMFIN' or @pos='VVINF' or @pos='VAFIN' or 
                                      @pos='VVPP' or @pos='VVFIN']">
                <xsl:sort select="@word"/>
                    <token>
                        <xsl:value-of select="@word"/>
                        (<xsl:value-of select="@lemma"/>;
                        <xsl:value-of select="@pos"/>;
                        <xsl:value-of select="@morph"/>)
                    </token>
            </xsl:for-each>    
    </content>
   </data>
 </xsl:template>    
</xsl:stylesheet> 

I'm getting this as output:

<?xml version="1.0" encoding="utf-8"?><data><content><token>Aktivitäten
                        (Aktivität;
                        NN;
                        Acc.Pl.Fem)
                    </token><token>Bank
                        (Bank;
                        NN;
                        Dat.Sg.Fem)
                    </token><token>Behörden
                        (Behörde;
                        NN;
                        Dat.Pl.Fem)
                    </token>

I'm trying to produce this output instead:

<?xml version="1.0" encoding="UTF-8"?>
<data>
 <content>
   <token>Aktivitäten(Aktivität;NN;Acc.Pl.Fem)</token>
   <token>Bank(Bank;NN;Dat.Sg.Fem)</token>
   <token>Behörden(Behörde;NN;Dat.Pl.Fem)</token>
   etc...

I'm new to XSLT. Thanks for any assistance.

Sivert Ekse asked Apr 22 '18 01:04


2 Answers

White space that is normally insignificant becomes significant when it abuts other literal text such as (, ;, and ): a text node in the stylesheet is stripped only if it consists entirely of white space.

If you wrap each of those strings (characters) in xsl:text,

                    <xsl:value-of select="@word"/>
                    <xsl:text>(</xsl:text>
                    <xsl:value-of select="@lemma"/>
                    <xsl:text>;</xsl:text>
                    ...

you'll get your desired XML output.
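For reference, here is a sketch of how the whole stylesheet might look with every literal character wrapped in xsl:text. It assumes, as the XPath in the question implies, that the input contains t elements with word, lemma, pos and morph attributes; the input document itself is not shown in the question, so treat this as an illustration and not a tested fix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="/">
    <data>
      <content>
        <xsl:for-each select="//t[@pos='ADV' or @pos='ADJD' or @pos='ADJA' or @pos='NE' or
                                  @pos='NN' or @pos='VMFIN' or @pos='VVINF' or @pos='VAFIN' or
                                  @pos='VVPP' or @pos='VVFIN']">
          <xsl:sort select="@word"/>
          <token>
            <!-- Each literal character sits in its own xsl:text, so the
                 line breaks and indentation between instructions are
                 whitespace-only text nodes and are stripped. -->
            <xsl:value-of select="@word"/>
            <xsl:text>(</xsl:text>
            <xsl:value-of select="@lemma"/>
            <xsl:text>;</xsl:text>
            <xsl:value-of select="@pos"/>
            <xsl:text>;</xsl:text>
            <xsl:value-of select="@morph"/>
            <xsl:text>)</xsl:text>
          </token>
        </xsl:for-each>
      </content>
    </data>
  </xsl:template>
</xsl:stylesheet>
```

The unused xs namespace declaration from the question is left out here; it has no effect on the white space either way.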


Another way [thanks to @Tomalak] to eliminate the undesired white space is to concatenate the string values in a single xsl:value-of:

<xsl:value-of select="concat(@word, '(', @lemma, ';', @pos, ';', @morph, ')')" />

kjhughes answered Sep 28 '22 17:09


Let's start from the beginning of your output:

<?xml version="1.0" encoding="utf-8"?><data><content><token>

This happens because the indent attribute of xsl:output defaults to no. The rationale for this default is that XML which is not meant to be read by a human is processed faster when it contains no extra spaces and newlines.

The token elements in your output contain "additional" newlines and spaces because those characters are actually present in your script.

Look at the following fragment of your script:

<xsl:value-of select="@word"/>
(<xsl:value-of select="@lemma"/>;

After <xsl:value-of select="@word"/> your script contains a single text node, consisting of:

  • a newline,
  • some spaces,
  • and finally ( - the only thing which actually should be printed.
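To make the point above concrete: if the literal characters share their text nodes with no newline or indentation, nothing extra is copied to the output. The sketch below is simplified for illustration and is not the fix recommended in this answer. It uses a hypothetical template matching a single t element and leaves out the selection and sorting from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Simplified: formats one t element; the pos filter and the sort
       from the question are omitted. -->
  <xsl:template match="t">
    <!-- The text nodes here are exactly "(", ";", ";" and ")",
         with no surrounding white space to carry over. -->
    <token><xsl:value-of select="@word"/>(<xsl:value-of select="@lemma"/>;<xsl:value-of select="@pos"/>;<xsl:value-of select="@morph"/>)</token>
  </xsl:template>
</xsl:stylesheet>
```

Keeping everything on one line works but is hard to read and easy to break with an editor's auto-formatter, which is one reason to prefer the approaches that do not depend on source layout.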

To get the result you want, make 2 changes:

  • After the xsl:stylesheet opening tag, add <xsl:output indent="yes"/>.
  • Replace the content of token with a single xsl:value-of whose select is a concat(...) call that takes everything to be printed as its arguments.
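Put together, the two changes might look like the sketch below. It assumes the same input as the question (t elements with word, lemma, pos and morph attributes), which is not shown there, so it has not been checked against the real data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Change 1: ask the processor to indent the serialized result. -->
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <data>
      <content>
        <xsl:for-each select="//t[@pos='ADV' or @pos='ADJD' or @pos='ADJA' or @pos='NE' or
                                  @pos='NN' or @pos='VMFIN' or @pos='VVINF' or @pos='VAFIN' or
                                  @pos='VVPP' or @pos='VVFIN']">
          <xsl:sort select="@word"/>
          <token>
            <!-- Change 2: a single value-of builds the whole string, so no
                 literal text node in the stylesheet carries line breaks. -->
            <xsl:value-of select="concat(@word, '(', @lemma, ';', @pos, ';', @morph, ')')"/>
          </token>
        </xsl:for-each>
      </content>
    </data>
  </xsl:template>
</xsl:stylesheet>
```

The exact indentation produced by indent="yes" is left to the XSLT processor, so the layout may differ slightly from the desired output shown in the question.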

For a working example see http://xsltransform.net/aiwQ3T

Valdi_Bo answered Sep 28 '22 17:09