
XPath select everything until first <br />

Tags: xml, xpath

I have this XML:

<test>
  <a>
    abc
    <xyz />
    def
    <br />
    ghi jkl
    <br />
    mno pqr
  </a>
  <a>
    xxx
  </a>
</test>

In every <a> I need to select all child nodes (including text nodes) up to the first <br />. If there is no <br />, I need to select all the child nodes of that <a>. So the expected result is:

abc
<xyz />
def

and

xxx

I am able to pick "everything before <br />" like this:

//a/*[following::br]

... but it selects everything before the last <br />. I need to select everything before the first one. I tried it like this:

//a/*[following::br[1]]

... with the same result as before. Neither expression selects the nodes in <a> elements that contain no <br />.

How can I do this using a (preferably single) XPath expression? Thanks for any suggestions.

Roman Hocke asked Mar 16 '23 07:03


1 Answer

You can try it this way (formatted for readability):

//a/node()[
    following-sibling::br[not(preceding-sibling::br)] 
        or 
    not(../br)
]

Brief explanation of the predicate expressions (the content of []) being used:

  • following-sibling::br[not(preceding-sibling::br)] : evaluates to true when the context node is located before the first <br/>, expressed as "has a following sibling <br/> that itself has no preceding sibling <br/>"

  • not(../br) : evaluates to true when the parent <a> doesn't have any child <br/>
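
To check the two predicate branches against the sample XML from the question, here is a minimal Python sketch. It does not evaluate the XPath itself (the standard library used here, xml.dom.minidom, has no XPath engine); it re-implements each predicate branch by hand. The helper names (is_br, matches_predicate, select_until_first_br) are made up for illustration and are not part of any library.

```python
from xml.dom import minidom

# Sample document from the question.
XML = """<test>
  <a>
    abc
    <xyz />
    def
    <br />
    ghi jkl
    <br />
    mno pqr
  </a>
  <a>
    xxx
  </a>
</test>"""


def is_br(node):
    """True for an element node named <br>."""
    return node.nodeType == node.ELEMENT_NODE and node.tagName == "br"


def preceding_siblings(node):
    """Yield siblings before `node` (like the preceding-sibling axis)."""
    sib = node.previousSibling
    while sib is not None:
        yield sib
        sib = sib.previousSibling


def following_siblings(node):
    """Yield siblings after `node` (like the following-sibling axis)."""
    sib = node.nextSibling
    while sib is not None:
        yield sib
        sib = sib.nextSibling


def matches_predicate(node):
    # Mirrors: following-sibling::br[not(preceding-sibling::br)]
    before_first_br = any(
        is_br(sib) and not any(is_br(p) for p in preceding_siblings(sib))
        for sib in following_siblings(node)
    )
    # Mirrors: not(../br)
    parent_has_no_br = not any(is_br(c) for c in node.parentNode.childNodes)
    return before_first_br or parent_has_no_br


def select_until_first_br(xml_text):
    """For each <a> (like //a), collect the child nodes that pass the predicate."""
    doc = minidom.parseString(xml_text)
    results = []
    for a in doc.getElementsByTagName("a"):
        selected = [n for n in a.childNodes if matches_predicate(n)]
        # Serialize and split on whitespace purely for compact display.
        results.append("".join(n.toxml() for n in selected).split())
    return results


result = select_until_first_br(XML)
print(result)  # [['abc', '<xyz/>', 'def'], ['xxx']]
```

Note that whitespace-only text nodes between elements are also selected by node(); the sketch drops them only when formatting the output.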

har07 answered Mar 30 '23 18:03