Count the number of words in an XML node using XSLT

Tags: xml, xslt

Here is the sample XML document.

<root>
  <node> count the number of words </node>
</root>

For this example, I want to count the number of words in the "node" element using XSLT.

The output should be: Number of words:: 5

Any ideas?

Your (Dimitre Novatchev) code works fine for the above XML. Will it also work for the following XML?

<root>

<test>
   <node> pass pass </node>
</test>

  <test>
      <node> fail pass fail </node>
  </test>

  <test>
      <node> pass pass fail </node>
  </test>

 </root>

The output should be: total number of words in the "node" elements: 8

Update 3:

This code works perfectly for the above XML document. Now suppose the document looks like this:

<root>
<test>
   <node> pass pass </node>
   <a> value </a>
   <b> value </b>
</test>

  <test>
      <node> fail fail </node>
      <b> value </b>
  </test>

  <test>
      <node> pass pass</node>
      <a> value </a>
  </test>
 </root>

But your code counts the words in the entire document. I want to count the words in the "node" elements only. The output should be:

Number of words in "node" :: 6 Total Pass:: 4 Total Fail:: 2

Thanks, Sathish

Sathish asked May 31 '11 13:05


2 Answers

Use this XPath one-liner:

  string-length(normalize-space(node)) 
- 
  string-length(translate(normalize-space(node),' ','')) +1

Here is a short verification using XSLT:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/*">
  <xsl:value-of select=
   " string-length(normalize-space(node))
    -
     string-length(translate(normalize-space(node),' ','')) +1"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the provided XML document:

<root>
    <node> count the number of words </node>
</root>

the wanted, correct result is produced:

5

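Explanation: normalize-space() trims leading and trailing whitespace and collapses every internal run of whitespace to a single space, so the words end up separated by exactly one space each. translate() then deletes those spaces, and the difference between the two string-length() values is the number of separators. Adding 1 gives the number of words.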
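
To make the arithmetic concrete: for the sample node the normalized string is 25 characters long and 21 characters remain once the spaces are removed, so 25 - 21 + 1 = 5. One edge case is worth noting: for an empty or whitespace-only node the bare expression evaluates to 0 - 0 + 1 = 1. Below is a minimal sketch of a guarded variant; the variable name `text` is an assumption of this sketch and not part of the original answer.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/*">
  <!-- Trim the text and collapse whitespace runs to single spaces -->
  <xsl:variable name="text" select="normalize-space(node)"/>
  <xsl:choose>
   <!-- No text at all: report 0 instead of the 1 the bare formula gives -->
   <xsl:when test="$text = ''">0</xsl:when>
   <xsl:otherwise>
    <!-- Number of separating spaces, plus one -->
    <xsl:value-of select=
     "string-length($text)
      - string-length(translate($text, ' ', '')) + 1"/>
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>
</xsl:stylesheet>
```

For the sample document this should still print 5; the guard only changes the result when the node has no words.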

Update1:

The OP asked:

"Your (Dimitre Novatchev) code is working fine for the above xml. Is your code will work for the following xml?"

<root>
  <test>
      <node> pass pass </node>
  </test>
  <test>
      <node> fail pass fail </node>
  </test>
  <test>
      <node> pass pass fail </node>
  </test>
</root>

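Answer: The same approach can be used, this time applied to the string value of the whole document (this works here because the node elements hold the only non-whitespace text):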

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
        <xsl:value-of select=
        "string-length(normalize-space(.))
        -
         string-length(translate(normalize-space(.),' ','')) +1
         "/>
    </xsl:template>
</xsl:stylesheet>

When this transformation is used on the newly-provided XML document (above), the wanted correct answer is produced:

8
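
Because that stylesheet takes the string value of the whole document, text in any other element would be counted as well. One way to restrict the count to `node` elements in XSLT 1.0 is to join their text into a variable first and apply the same formula to the joined string. The following is a sketch under that assumption; the variable names `joined` and `words` are made up for this illustration.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- Concatenate the text of every node element, space-separated -->
        <xsl:variable name="joined">
            <xsl:for-each select="//node">
                <xsl:value-of select="concat(., ' ')"/>
            </xsl:for-each>
        </xsl:variable>
        <!-- Converting the variable to a string and normalizing it
             leaves exactly one space between consecutive words -->
        <xsl:variable name="words" select="normalize-space($joined)"/>
        <xsl:choose>
            <!-- No node elements, or only empty ones -->
            <xsl:when test="$words = ''">0</xsl:when>
            <xsl:otherwise>
                <xsl:value-of select=
                "string-length($words)
                 - string-length(translate($words, ' ', '')) + 1"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

Applied to the document above, this should again give 8, and text in sibling elements such as `a` or `b` would no longer be included.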

Update2: The OP then asked in a comment:

"Can I have a comparision with the words in the node with some default word. Conside node contains value "pass pass fail". I want to calculate number of pass and number of fail. LIke pass=2 fail=1. is it possible? Help me man"

Answer:

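The same approach works for this modification of the problem, too. Note that the code below counts the letters 'p' and 'f', which matches the word counts only because each word here contains exactly one such letter. In the general case you need proper tokenization (please ask about this in a new question):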

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="node">
        pass: <xsl:value-of select=
                "string-length()
                -
                 string-length(translate(.,'p',''))
         "/>
<xsl:text/>     fail: <xsl:value-of select=
                "string-length()
                -
                 string-length(translate(.,'f',''))
         "/>
    </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the last XML document (above), the wanted, correct result is produced:

    pass: 2     fail: 0
    pass: 1     fail: 2
    pass: 2     fail: 1
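
Counting letters breaks down as soon as a word contains the letter more than once or another word shares it. As a hedged sketch of the tokenization mentioned above, a recursive named template can walk a space-separated string one word at a time and count exact matches. The template name `count-word` and its parameters are hypothetical names chosen for this example, and the sketch totals the matches across all `node` elements rather than reporting them per node.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- Join the text of all node elements, then normalize it -->
        <xsl:variable name="joined">
            <xsl:for-each select="//node">
                <xsl:value-of select="concat(., ' ')"/>
            </xsl:for-each>
        </xsl:variable>
        <xsl:variable name="words" select="normalize-space($joined)"/>
        <xsl:text>pass: </xsl:text>
        <xsl:call-template name="count-word">
            <xsl:with-param name="text" select="$words"/>
            <xsl:with-param name="word" select="'pass'"/>
        </xsl:call-template>
        <xsl:text> fail: </xsl:text>
        <xsl:call-template name="count-word">
            <xsl:with-param name="text" select="$words"/>
            <xsl:with-param name="word" select="'fail'"/>
        </xsl:call-template>
    </xsl:template>

    <!-- Counts how many space-separated tokens of $text equal $word -->
    <xsl:template name="count-word">
        <xsl:param name="text"/>
        <xsl:param name="word"/>
        <xsl:param name="count" select="0"/>
        <xsl:choose>
            <!-- Nothing left to examine: emit the running total -->
            <xsl:when test="$text = ''">
                <xsl:value-of select="$count"/>
            </xsl:when>
            <xsl:otherwise>
                <!-- Appending a space makes the last token findable too -->
                <xsl:variable name="first"
                 select="substring-before(concat($text, ' '), ' ')"/>
                <xsl:call-template name="count-word">
                    <xsl:with-param name="text"
                     select="substring-after($text, ' ')"/>
                    <xsl:with-param name="word" select="$word"/>
                    <!-- number(true()) is 1 and number(false()) is 0 -->
                    <xsl:with-param name="count"
                     select="$count + number($first = $word)"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

For the three-node document above this should print pass: 5 fail: 3 in total, and for the OP's later document with the extra `a` and `b` elements it should print pass: 4 fail: 2. Each word adds one level of recursion, so very long texts may call for a divide-and-conquer variant.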
Dimitre Novatchev answered Oct 20 '22 06:10


In XSLT, I think you would need to remove any double spacing first and then count the remaining spaces to find the answer, although I'm sure there are better ways!

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="xml" indent="yes"/>

<xsl:template match="root">
        <xsl:for-each select="node">
                <xsl:call-template name="word-count">
                        <xsl:with-param name="data" select="normalize-space(.)"/>
                        <xsl:with-param name="num" select="1"/>
                </xsl:call-template>
        </xsl:for-each>
</xsl:template>

    <xsl:template name="word-count">
            <xsl:param name="data"/>
            <xsl:param name="num"/>
            <xsl:variable name="newdata" select="$data"/>
            <xsl:variable name="remaining" select="substring-after($newdata,' ')"/>                

            <xsl:choose>
                    <xsl:when test="$remaining">
                            <xsl:call-template name="word-count">
                                    <xsl:with-param name="data" select="$remaining"/>
                                    <xsl:with-param name="num" select="$num+1"/>
                            </xsl:call-template>
                    </xsl:when>
                    <!-- Test the data itself, so a single-word node reports 1 instead of "no words" -->
                    <xsl:when test="not($data)">
                            no words...
                    </xsl:when>
                    <xsl:otherwise>
                            <xsl:value-of select="$num"/>
                    </xsl:otherwise>
            </xsl:choose>                
    </xsl:template>

</xsl:stylesheet>

This example code works. I amended it from a stylesheet I had that converted some legacy code into useful HTML output.

Updated the code to be more robust: it handles duplicate whitespace and also catches empty nodes.

Updated to solve the additional problem:

 <xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="html"/>

        <xsl:template match="root">
        <xsl:for-each select="test/node">
                <xsl:call-template name="word-count">
                        <xsl:with-param name="data" select="normalize-space(.)"/>
                        <xsl:with-param name="num" select="1"/>
                        <xsl:with-param name="pass" select="0"/>
                        <xsl:with-param name="fail" select="0"/>
                </xsl:call-template>
        </xsl:for-each>
</xsl:template>

<xsl:template name="word-count">
        <xsl:param name="data"/>
        <xsl:param name="num"/>
        <xsl:param name="fail"/>
        <xsl:param name="pass"/>
        <xsl:variable name="newdata" select="$data"/>
        <xsl:variable name="first">
                <xsl:choose>
                        <xsl:when test="substring-before($newdata,' ')">
                                <xsl:value-of select="substring-before($newdata,' ')"/>  
                        </xsl:when>
                        <xsl:otherwise>
                                <xsl:value-of select="$newdata"/>
                        </xsl:otherwise>
                </xsl:choose>
        </xsl:variable> 
        <xsl:variable name="remaining" select="substring-after($newdata,' ')"/>
        <xsl:variable name="newpass">
                <xsl:choose>
                        <xsl:when test="$first='pass'">
                                <xsl:value-of select="$pass+1"/>
                        </xsl:when>
                        <xsl:otherwise>
                                <xsl:value-of select="$pass"/>
                        </xsl:otherwise>
                </xsl:choose>
        </xsl:variable>        
        <xsl:variable name="newfail">
                <xsl:choose>
                        <xsl:when test="$first='fail'">
                                <xsl:value-of select="$fail+1"/>
                        </xsl:when>
                        <xsl:otherwise>
                                <xsl:value-of select="$fail"/>
                        </xsl:otherwise>
                </xsl:choose>
        </xsl:variable>

        <xsl:choose>
                <xsl:when test="$remaining">
                        <xsl:call-template name="word-count">                        
                                <xsl:with-param name="data" select="$remaining"/>
                                <xsl:with-param name="num" select="$num+1"/>
                                <xsl:with-param name="pass" select="$newpass"/>
                                <xsl:with-param name="fail" select="$newfail"/>
                        </xsl:call-template>
                </xsl:when>
                <!-- Test the data itself, so a single-word node is not reported as empty -->
                <xsl:when test="not($data)">
                        it was empty
                </xsl:when>
                <xsl:otherwise>
                        <xsl:value-of select="$first"/>
                        wordcount:<xsl:value-of select="$num"/>
                        pass:<xsl:value-of select="$newpass"/>
                        fail:<xsl:value-of select="$newfail"/><br/>
                </xsl:otherwise>
        </xsl:choose>
</xsl:template>
</xsl:stylesheet>
Treemonkey answered Oct 20 '22 06:10