
How do I select an XML node with the longest child #text node value with XPath?

I've previously used XPath to select the node with the largest integer id value, with this query:

//somenode[not(@id <= preceding::somenode/@id) and not(@id <= following::somenode/@id)]

I was hoping I could do something similar for the length of the text, like:

//entry[not(string-length(child::text()) <= string-length(preceding::entry/child::text())) and not(string-length(child::text()) <= string-length(following::entry/child::text()))]

But it returns a bunch of nodes instead of just one.

Sample XML:

<xml>
  <entry>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</entry>
  <entry>Nam dignissim mi a massa mattis rutrum eu eget mauris.</entry>
  <entry>Ut at diam a sem scelerisque pretium nec pulvinar purus.</entry>
  <entry>Nunc in nisi nec dolor accumsan suscipit vel a quam.</entry>
  <entry>Nunc suscipit lobortis arcu, nec adipiscing libero bibendum nec.</entry>
  <entry>Aenean eget ipsum et nunc eleifend scelerisque.</entry>
  <entry>In eu magna et diam volutpat molestie.</entry>
  <entry>In volutpat luctus mi, eu laoreet orci dictum vel.</entry>
  <entry>In mattis mi nec magna sodales eu bibendum felis aliquet.</entry>
<!-- etc for 800 more lines or so -->
  <entry>Duis auctor felis id neque gravida ut auctor ipsum ullamcorper.</entry>
  <entry>Sed vel tortor mauris, et aliquet tellus.</entry>
</xml>

XPath test: http://chris.photobooks.com/xml/default.htm?state=1o

travis asked Nov 04 '22 12:11



1 Answer

The wanted element(s) cannot be selected with a single XPath 1.0 expression, because XPath 1.0 cannot apply a function to every node of a selected node-set: string-length(someNodeSet) is evaluated only on the first node (in document order) of that node-set. Another reason is that XPath 1.0 has no way to name and reference range variables.
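To see why the expression from the question returns several nodes, here is a rough Python emulation of how XPath 1.0 would evaluate it under the first-node rule. This is only a sketch: the helper names (first_length, question_predicate) are made up for illustration, the elided entries are left out, and the predicate is re-implemented by hand rather than run through a real XPath engine.

```python
import xml.etree.ElementTree as ET

# Sample entries from the question (the elided ~800 lines are left out).
XML = """<xml>
  <entry>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</entry>
  <entry>Nam dignissim mi a massa mattis rutrum eu eget mauris.</entry>
  <entry>Ut at diam a sem scelerisque pretium nec pulvinar purus.</entry>
  <entry>Nunc in nisi nec dolor accumsan suscipit vel a quam.</entry>
  <entry>Nunc suscipit lobortis arcu, nec adipiscing libero bibendum nec.</entry>
  <entry>Aenean eget ipsum et nunc eleifend scelerisque.</entry>
  <entry>In eu magna et diam volutpat molestie.</entry>
  <entry>In volutpat luctus mi, eu laoreet orci dictum vel.</entry>
  <entry>In mattis mi nec magna sodales eu bibendum felis aliquet.</entry>
  <entry>Duis auctor felis id neque gravida ut auctor ipsum ullamcorper.</entry>
  <entry>Sed vel tortor mauris, et aliquet tellus.</entry>
</xml>"""

entries = ET.fromstring(XML).findall("entry")


def first_length(nodes):
    """Mimic XPath 1.0 string-length(node-set): only the first node in
    document order is used; an empty node-set converts to ''."""
    return len(nodes[0].text or "") if nodes else 0


def question_predicate(i):
    """Hand emulation of the question's predicate for entries[i]."""
    own = len(entries[i].text or "")
    preceding = entries[:i]       # preceding::entry, in document order
    following = entries[i + 1:]   # following::entry, in document order
    return (not (own <= first_length(preceding))
            and not (own <= first_length(following)))


# Entries that pass the predicate: each is only compared against the very
# first entry and against its immediate successor, not against all entries.
selected = [e.text for i, e in enumerate(entries) if question_predicate(i)]

# The entry that is actually longest, found by comparing against everything.
longest = max(entries, key=lambda e: len(e.text or ""))

print(len(selected), "entries pass the XPath 1.0 predicate")
print("actual longest:", longest.text)
```

In this emulation each entry is effectively compared with just two others, which is why more than one entry survives the filter.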

In XPath 2.0 this is trivial:

/*/entry[not(string-length(.) < /*/entry/string-length(.))]

The above selects all entry elements whose string value has the maximal length.

/*/entry[not(string-length(.) < /*/entry/string-length(.))] [1]

The above selects the first (in document order) such entry element.
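For readers without an XPath 2.0 processor at hand, the logic of the two expressions can be mirrored in plain Python. The sketch below assumes a subset of the sample entries and uses illustrative variable names (longest_all, longest_first); it imitates the existential meaning of the general comparison operator and does not call an XPath 2.0 engine.

```python
import xml.etree.ElementTree as ET

# A subset of the sample entries, enough to show the idea.
XML = """<xml>
  <entry>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</entry>
  <entry>Nunc suscipit lobortis arcu, nec adipiscing libero bibendum nec.</entry>
  <entry>Duis auctor felis id neque gravida ut auctor ipsum ullamcorper.</entry>
  <entry>Sed vel tortor mauris, et aliquet tellus.</entry>
</xml>"""

entries = ET.fromstring(XML).findall("entry")

# string-length(.) measures the element's string value, i.e. all of its
# descendant text concatenated.
lengths = [len("".join(e.itertext())) for e in entries]

# In XPath 2.0, `a < seq` is true if a is less than ANY item of seq, so
# not(a < seq) keeps only the entries that no other entry exceeds.
longest_all = [e for e, n in zip(entries, lengths)
               if not any(n < other for other in lengths)]

# Counterpart of appending [1]: the first match in document order.
longest_first = longest_all[0]

print(longest_first.text)
```

If several entries tie for the maximal length, longest_all holds all of them, which corresponds to the first expression; longest_first corresponds to the second.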

XSLT 2.0-based verification:

This transformation:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:template match="/">
  <xsl:sequence select=
   "/*/entry[not(string-length(.) &lt; /*/entry/string-length(.))]"/>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<xml>
  <entry>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</entry>
  <entry>Nam dignissim mi a massa mattis rutrum eu eget mauris.</entry>
  <entry>Ut at diam a sem scelerisque pretium nec pulvinar purus.</entry>
  <entry>Nunc in nisi nec dolor accumsan suscipit vel a quam.</entry>
  <entry>Nunc suscipit lobortis arcu, nec adipiscing libero bibendum nec.</entry>
  <entry>Aenean eget ipsum et nunc eleifend scelerisque.</entry>
  <entry>In eu magna et diam volutpat molestie.</entry>
  <entry>In volutpat luctus mi, eu laoreet orci dictum vel.</entry>
  <entry>In mattis mi nec magna sodales eu bibendum felis aliquet.</entry>
<!-- etc for 800 more lines or so -->
  <entry>Duis auctor felis id neque gravida ut auctor ipsum ullamcorper.</entry>
  <entry>Sed vel tortor mauris, et aliquet tellus.</entry>
</xml>

selects the entry elements with the maximum string-length (in this case only one) and outputs them:

<entry>Nunc suscipit lobortis arcu, nec adipiscing libero bibendum nec.</entry>
Dimitre Novatchev answered Nov 09 '22 08:11
