
Count the number of words in an XML node using XSL




Here is a sample XML document:

  <node> count the number of words </node>

For this example, I want to count the number of words in the "node" element using XSLT.

The output should be: Number of words:: 5

Any ideas?

Your (Dimitre Novatchev) code works fine for the above XML. Will it also work for the following XML?


   <node> pass pass </node>

      <node> fail pass fail </node>

      <node> pass pass fail </node>


The output should be: total number of words in the node "node": 8


This code works perfectly for the above XML document. Now suppose the document is:

   <node> pass pass </node>
   <a> value </a>
   <b> value </b>

      <node> fail fail </node>
      <b> value </b>

      <node> pass pass</node>
      <a> value </a>

But your code counts the words in the entire document. I want to count the words in elements named "node" only. The output should be:

Number of words in "node" :: 6 Total Pass:: 4 Total Fail:: 2

Thanks, Sathish

Sathish asked May 31 '11 13:05


2 Answers

Use this XPath one-liner:

  string-length(normalize-space(node))
  - string-length(translate(normalize-space(node),' ','')) +1

Here is a short verification using XSLT:

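<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/*">
  <!-- length of the normalized text minus its length with spaces removed
       = number of single spaces; words = spaces + 1 -->
  <xsl:value-of select=
   " string-length(normalize-space(node))
     - string-length(translate(normalize-space(node),' ','')) +1"/>
 </xsl:template>
</xsl:stylesheet>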

When this transformation is applied to the provided XML document (with "node" as a child of the top element):

    <node> count the number of words </node>

the wanted, correct result is produced:

    5

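Explanation: normalize-space() trims the text and collapses every run of whitespace into a single space, and translate() then deletes those spaces. The difference between the two string-length() values is the number of spaces between words, so adding 1 gives the number of words. For " count the number of words " this is 25 - 21 + 1 = 5.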
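
One edge case worth noting: the trailing +1 is applied even when the node holds no text, so an empty node is reported as 1 word. Below is a minimal sketch of a guard, not part of the original answer. It assumes the XPath 1.0 rule that a boolean used in arithmetic is converted to a number (true becomes 1, false becomes 0). The variable name "s" is only illustrative.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/*">
  <!-- Trimmed text, with runs of whitespace collapsed to single spaces -->
  <xsl:variable name="s" select="normalize-space(node)"/>
  <!-- Spaces between words, plus 1 only when some text is left -->
  <xsl:value-of select=
   "string-length($s)
    - string-length(translate($s,' ',''))
    + number(boolean($s))"/>
 </xsl:template>
</xsl:stylesheet>
```

For an empty or whitespace-only node this yields 0; for the sample node above it still yields 5.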


The OP asked:

"Your (Dimitre Novatchev) code is working fine for the above xml. Is your code will work for the following xml?"

      <node> pass pass </node>
      <node> fail pass fail </node>
      <node> pass pass fail </node>

Answer: The same approach can be used:

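<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <!-- At the root, "." is the text of the whole document -->
    <xsl:template match="/">
        <xsl:value-of select=
         "string-length(normalize-space(.))
          - string-length(translate(normalize-space(.),' ','')) +1"/>
    </xsl:template>
</xsl:stylesheet>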

When this transformation is applied to the newly provided XML document (above), the wanted, correct answer is produced:

    8

Update2: The OP then asked in a comment:

"Can I have a comparision with the words in the node with some default word. Conside node contains value "pass pass fail". I want to calculate number of pass and number of fail. LIke pass=2 fail=1. is it possible? Help me man"


The same approach works for this modification of the problem, too (in the general case, though, you need a good tokenization; ask me about this in a new question, please):

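<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <!-- "pass" has exactly one 'p' and "fail" exactly one 'f',
         so counting those letters counts the words -->
    <xsl:template match="node">
        pass: <xsl:value-of select=
        "string-length(.) - string-length(translate(.,'p',''))"/>
<xsl:text/>     fail: <xsl:value-of select=
        "string-length(.) - string-length(translate(.,'f',''))"/>
    </xsl:template>
</xsl:stylesheet>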

When this transformation is applied to the last XML document (above), the wanted, correct result is produced:

    pass: 2     fail: 0
    pass: 1     fail: 2
    pass: 2     fail: 1
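
The tokenization mentioned above, and the OP's request for totals over "node" elements only, can be sketched as follows. This is a hedged illustration for XSLT 1.0, not part of the original answer: the named template "count-word" and the variables "joined" and "words" are hypothetical names, and the sketch assumes words are separated by whitespace only. It joins the text of every "node" element, ignores siblings such as "a" and "b", and then compares whole words one at a time.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- Join the text of every "node" element, one space after each;
             other elements are never visited -->
        <xsl:variable name="joined">
            <xsl:for-each select="//node">
                <xsl:value-of select="concat(normalize-space(.), ' ')"/>
            </xsl:for-each>
        </xsl:variable>
        <xsl:variable name="words" select="normalize-space($joined)"/>

        <xsl:text>Number of words in "node" :: </xsl:text>
        <xsl:value-of select=
         "string-length($words)
          - string-length(translate($words,' ',''))
          + number(boolean($words))"/>
        <xsl:text> Total Pass:: </xsl:text>
        <xsl:call-template name="count-word">
            <xsl:with-param name="text" select="$words"/>
            <xsl:with-param name="word" select="'pass'"/>
        </xsl:call-template>
        <xsl:text> Total Fail:: </xsl:text>
        <xsl:call-template name="count-word">
            <xsl:with-param name="text" select="$words"/>
            <xsl:with-param name="word" select="'fail'"/>
        </xsl:call-template>
    </xsl:template>

    <!-- Hypothetical helper: counts whole-word matches of $word in a
         space-separated $text by peeling off one word per call -->
    <xsl:template name="count-word">
        <xsl:param name="text"/>
        <xsl:param name="word"/>
        <xsl:param name="count" select="0"/>
        <xsl:choose>
            <xsl:when test="$text = ''">
                <xsl:value-of select="$count"/>
            </xsl:when>
            <xsl:otherwise>
                <!-- The appended space lets substring-before find the last word too -->
                <xsl:variable name="first"
                    select="substring-before(concat($text, ' '), ' ')"/>
                <xsl:call-template name="count-word">
                    <xsl:with-param name="text" select="substring-after($text, ' ')"/>
                    <xsl:with-param name="word" select="$word"/>
                    <xsl:with-param name="count" select="$count + number($first = $word)"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

For the OP's last sample (nodes "pass pass", "fail fail" and "pass pass" next to "a" and "b" elements), this should print: Number of words in "node" :: 6 Total Pass:: 4 Total Fail:: 2. The helper recurses once per word, so very long texts may hit a processor's recursion limit.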
Dimitre Novatchev answered Oct 20 '22 06:10


In XSLT, I think you would need to process the text to remove any double spacing and then count the remaining spaces to find the answer, although I'm sure there are better ways!

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="xml" indent="yes"/>

<xsl:template match="root">
        <xsl:for-each select="node">
                <xsl:call-template name="word-count">
                        <xsl:with-param name="data" select="normalize-space(.)"/>
                        <xsl:with-param name="num" select="1"/>
                </xsl:call-template>
        </xsl:for-each>
</xsl:template>

    <!-- Recursive counter: each call drops the first word and adds 1 to $num -->
    <xsl:template name="word-count">
            <xsl:param name="data"/>
            <xsl:param name="num"/>
            <xsl:variable name="newdata" select="$data"/>
            <xsl:variable name="remaining" select="substring-after($newdata,' ')"/>

            <xsl:choose>
                    <xsl:when test="$remaining">
                            <xsl:call-template name="word-count">
                                    <xsl:with-param name="data" select="$remaining"/>
                                    <xsl:with-param name="num" select="$num+1"/>
                            </xsl:call-template>
                    </xsl:when>
                    <!-- first call with no text at all: the node was empty -->
                    <xsl:when test="$num = 1 and not($newdata)">
                            no words...
                    </xsl:when>
                    <xsl:otherwise>
                            <xsl:value-of select="$num"/>
                    </xsl:otherwise>
            </xsl:choose>
    </xsl:template>
</xsl:stylesheet>


This example code works; I amended it from a stylesheet I had which was processing some legacy code into useful HTML output!

Updated the code to make it more robust: it handles duplicate whitespace and also catches empty nodes :>

Updated to solve the additional problem!

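 <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="html"/>

        <xsl:template match="root">
        <xsl:for-each select="test/node">
                <xsl:call-template name="word-count">
                        <xsl:with-param name="data" select="normalize-space(.)"/>
                        <xsl:with-param name="num" select="1"/>
                        <xsl:with-param name="pass" select="0"/>
                        <xsl:with-param name="fail" select="0"/>
                </xsl:call-template>
        </xsl:for-each>
        </xsl:template>

<xsl:template name="word-count">
        <xsl:param name="data"/>
        <xsl:param name="num"/>
        <xsl:param name="fail"/>
        <xsl:param name="pass"/>
        <xsl:variable name="newdata" select="$data"/>
        <!-- first word: the text before the first space, or all of it if there is no space -->
        <xsl:variable name="first">
                <xsl:choose>
                        <xsl:when test="substring-before($newdata,' ')">
                                <xsl:value-of select="substring-before($newdata,' ')"/>
                        </xsl:when>
                        <xsl:otherwise>
                                <xsl:value-of select="$newdata"/>
                        </xsl:otherwise>
                </xsl:choose>
        </xsl:variable>
        <xsl:variable name="remaining" select="substring-after($newdata,' ')"/>
        <!-- running totals, bumped when the first word matches -->
        <xsl:variable name="newpass">
                <xsl:choose>
                        <xsl:when test="$first='pass'">
                                <xsl:value-of select="$pass+1"/>
                        </xsl:when>
                        <xsl:otherwise>
                                <xsl:value-of select="$pass"/>
                        </xsl:otherwise>
                </xsl:choose>
        </xsl:variable>
        <xsl:variable name="newfail">
                <xsl:choose>
                        <xsl:when test="$first='fail'">
                                <xsl:value-of select="$fail+1"/>
                        </xsl:when>
                        <xsl:otherwise>
                                <xsl:value-of select="$fail"/>
                        </xsl:otherwise>
                </xsl:choose>
        </xsl:variable>

        <xsl:choose>
                <xsl:when test="$remaining">
                        <xsl:call-template name="word-count">
                                <xsl:with-param name="data" select="$remaining"/>
                                <xsl:with-param name="num" select="$num+1"/>
                                <xsl:with-param name="pass" select="$newpass"/>
                                <xsl:with-param name="fail" select="$newfail"/>
                        </xsl:call-template>
                </xsl:when>
                <!-- first call with no text at all: the node was empty -->
                <xsl:when test="$num = 1 and not($newdata)">
                        it was empty
                </xsl:when>
                <xsl:otherwise>
                        <xsl:value-of select="$first"/>
                        wordcount:<xsl:value-of select="$num"/>
                        pass:<xsl:value-of select="$newpass"/>
                        fail:<xsl:value-of select="$newfail"/><br/>
                </xsl:otherwise>
        </xsl:choose>
</xsl:template>
</xsl:stylesheet>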
Treemonkey answered Oct 20 '22 06:10
