I would like to remove tags which contain only whitespace/newline/tab chars, as below:
<p>    </p>
How would you do this using xpath functions and xslt templates?
This transformation (overriding the identity rule):
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
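 <!-- Identity rule: copy every node and attribute unchanged -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>
 <!-- Override: output nothing for elements that have no child elements
      and no text node containing non-whitespace characters -->
 <xsl:template match="*[not(*) and not(text()[normalize-space()])]"/>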
</xsl:stylesheet>
when applied to the following XML document:
<t>
 <a>
  <b>
    <c/>
  </b>
 </a>
 <p></p>
 <p>  </p>
 <p>Text</p>
</t>
correctly produces the wanted result:
<t>
   <a>
      <b/>
   </a>
   <p>Text</p>
</t>
Remember: using and overriding the identity rule (identity template) is the most fundamental and powerful XSLT design pattern. It is the right choice for the many problems where most nodes are to be copied unchanged and only some specific nodes need to be altered, deleted, or renamed.
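As a further illustration of the same pattern, here is a minimal sketch of a renaming override. It is not part of the original solution, and the target name para is only a hypothetical choice for the example:
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <!-- Identity rule: copy everything unchanged by default -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>
 <!-- Override: emit p elements as para (hypothetical name),
      keeping their attributes and content -->
 <xsl:template match="p">
     <para>
       <xsl:apply-templates select="node()|@*"/>
     </para>
 </xsl:template>
</xsl:stylesheet>
Only the overriding template differs from the solution above; every node that is not a p element still falls through to the identity rule.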
Note: @Abel in his comment recommends that some bits of this solution need to be further explained:
For the uninitiated or curious:
not(*) means: not having a child element.
not(text()[normalize-space()]) means: not having a text node that contains any non-whitespace characters.
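Because of the not(*) test, an element whose only children are themselves removed (such as b in the result above) is kept as an empty element. If such elements should be removed as well, one possible variant is to test all descendant text nodes instead. This is only a sketch and changes the behaviour of the original solution:
 <!-- Variant (assumption, not the original rule): drop any element
      that has no non-whitespace text anywhere beneath it -->
 <xsl:template match="*[not(descendant::text()[normalize-space()])]"/>
Used in place of the overriding template, this would leave only t and <p>Text</p> from the sample document. Note that it also deletes text-free elements that carry attributes; adding "and not(@*)" inside the predicate would keep those.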