Return nodes that have siblings that start-with substring of self

I'm trying to come up with a clean Xpath 1.0 expression in Excel's FILTERXML() function to return nodes with the following requirement:

  • The node must have a following sibling that starts with the same three characters as the node itself.

The point is to find out if there is some similarity to a certain degree in the data. Imagine the following sample data:

<t>
    <s>ABCDEF</s>
    <s>GHIJKL</s>
    <s>MNOPQR</s>
    <s>GHISTU</s>
    <s>ABVWXY</s>
</t>

From here, I'd like to return GHIJKL, since its first three characters, 'GHI', are found at the start of the second-to-last node.

I've been trying to piece together functions like starts-with(), substring() and count(), but have not been able to get it right. My (obviously wrong) attempt:

//s[count(following::*[starts-with(., substring(<placeholder>, 1,3))]>0]

I'm unsure whether this is possible at all. I don't know what to write in place of the placeholder, or how to rework the query so that the expression takes the three leftmost characters of each node and tests whether any of the following nodes start with them.

JvdV asked Sep 01 '25 01:09


1 Answer

Would the following expression work?

//s[substring(., 1, 3) = following::*/substring(., 1, 3)]
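
One caveat: using a function call as a path step (`following::*/substring(...)`) is XPath 2.0 syntax, so an XPath 1.0 engine such as the one behind FILTERXML() may reject it. To sanity-check what the expression is meant to select, here is a minimal sketch of the same logic in plain Python using only the standard library. The function name `nodes_with_prefix_match` and the `prefix_len` parameter are made up for illustration, and the sketch assumes a flat list of `<s>` elements like the sample, so "following" simply means "later in document order".

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question.
SAMPLE = """<t>
    <s>ABCDEF</s>
    <s>GHIJKL</s>
    <s>MNOPQR</s>
    <s>GHISTU</s>
    <s>ABVWXY</s>
</t>"""


def nodes_with_prefix_match(xml_text, prefix_len=3):
    """Return the text of each <s> node whose first `prefix_len`
    characters also start the text of some later <s> node.

    Hypothetical helper: it mirrors the intent of the XPath
    expression above, not how an XPath engine evaluates it.
    """
    root = ET.fromstring(xml_text)
    # iter() walks the tree in document order.
    values = [(s.text or "") for s in root.iter("s")]

    matches = []
    for i, value in enumerate(values):
        prefix = value[:prefix_len]
        # Only look at nodes that come after the current one.
        if any(later.startswith(prefix) for later in values[i + 1:]):
            matches.append(value)
    return matches


print(nodes_with_prefix_match(SAMPLE))  # ['GHIJKL']
```

On the sample data only GHIJKL qualifies: ABCDEF does not, because ABVWXY shares just two leading characters with it.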
Alexandra Dudkina answered Sep 03 '25 19:09