
Why does normalize-space() not strip all spaces?

I wrote a small XSLT stylesheet that uses the normalize-space() function to strip unnecessary spaces:

http://xsltransform.net/bnnZWM

<xsl:template match="page/pageFunctionResult/*/text()">
   <xsl:value-of select="normalize-space(.)"/>
</xsl:template>

The XSLT itself works, except that some spaces are not normalized:

<category> TEST </category>

I don't understand why normalize-space() can't remove these spaces.

Adrian asked Dec 19 '22 19:12



2 Answers

As noted in the comments, the characters are actually NO-BREAK SPACE characters (U+00A0, decimal 160), which normalize-space() does not treat as whitespace. To handle them as regular spaces, translate them to spaces first:

<xsl:value-of select="normalize-space(translate(., '&#160;', ' '))"/>
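
For context, here is a minimal sketch of how that expression could sit in a complete XSLT 1.0 stylesheet. The identity template is an assumption (the original stylesheet is only available through the link in the question); the match pattern is the one from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed identity template: copies anything no other template matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Turn each no-break space (U+00A0) into a regular space first,
       so that normalize-space() can then trim and collapse it. -->
  <xsl:template match="page/pageFunctionResult/*/text()">
    <xsl:value-of select="normalize-space(translate(., '&#160;', ' '))"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a hypothetical input such as <category>&#160;TEST&#160;</category> inside page/pageFunctionResult, the matched text node should come out as TEST with no surrounding spaces.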
michael.hor257k answered Feb 07 '23 03:02



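The normalize-space() function strips leading and trailing whitespace and collapses each internal run of whitespace to a single space, where whitespace is defined by the S production in XML: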

[3]       S      ::=      (#x20 | #x9 | #xD | #xA)+

The characters surrounding TEST in your linked example are none of these characters (as @har07 points out in the comments). Per @michael.hor257k's clever use of the XPath 2.0 function string-to-codepoints(),

<xsl:template match="page/pageFunctionResult[1]/category[1]">
  <xsl:value-of select="string-to-codepoints(substring(., 1, 1))"/>
</xsl:template>

we can see that they are NO-BREAK SPACE characters (#xA0), aka &nbsp;.
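
To inspect every character rather than only the first, a variation on that template could print all code points of each category element. This is only a sketch: it assumes an XSLT 2.0 processor and the page/pageFunctionResult/category structure used in the snippet above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain-text output: one line of code points per category element. -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="page/pageFunctionResult/category">
      <!-- string-to-codepoints() returns one integer per character;
           the separator attribute joins them with single spaces. -->
      <xsl:value-of select="string-to-codepoints(.)" separator=" "/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For a category whose content is TEST wrapped in one no-break space on each side, this prints 160 84 69 83 84 160. A regular space would show up as 32 instead of 160.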

To remove &nbsp;, you'll need something more than normalize-space() alone.

XSLT 1.0

See @michael.hor257k's answer. (+1)

XSLT 2.0

If you want to cover &nbsp; along with other Unicode separator characters, use replace() with the \p{Z} category escape (which matches space, line, and paragraph separators) before calling normalize-space():

<xsl:value-of select="normalize-space(replace(., '\p{Z}', ' '))"/>
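
If the same cleanup is needed in several places, the expression could be wrapped in a stylesheet function. The following is a sketch under assumptions: the function name local:normalize-all-space, the urn:example:local namespace, and the identity template are hypothetical and not part of the original stylesheet; an XSLT 2.0 processor is required.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    exclude-result-prefixes="xs local">

  <!-- Hypothetical helper: map every Unicode separator (category Z,
       which includes U+00A0) to a plain space, then normalize. -->
  <xsl:function name="local:normalize-all-space" as="xs:string">
    <xsl:param name="input" as="xs:string?"/>
    <xsl:sequence select="normalize-space(replace($input, '\p{Z}', ' '))"/>
  </xsl:function>

  <!-- Assumed identity template: copies anything no other template matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Same match pattern as in the question. -->
  <xsl:template match="page/pageFunctionResult/*/text()">
    <xsl:value-of select="local:normalize-all-space(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the logic in one function means that the choice of which characters count as whitespace only has to change in one place.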
kjhughes answered Feb 07 '23 03:02
