XPath on nested elements with text() but no whitespace

Question

I have poorly structured XHTML that I need to parse with XPath. It looks like this:

<div class="foo">
  i need this text
  <br/>
  <br/>
  <span>sometext</span>
</div>

<div class="foo">
  <span>some other text</span>
  <span>sometext</span>
</div>

I want to select all the content of the first div, the one that contains "i need this text". My problem is that the div elements contain whitespace and other nodes, so //div[@class="foo"]/text() also returns whitespace-only strings, including for the second div. How can I ignore these empty results?

Dimitre Novatchev · Accepted Answer

Use:

//div
   [.//text()
        [normalize-space() = 'i need this text']
   ]
    //text()[normalize-space()]

This selects every text node that is not whitespace-only and is a descendant of any div in the document, provided that the div has a text-node descendant whose normalized string value is "i need this text".
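To make the selection concrete, here is a minimal sketch that emulates the expression by hand using only Python's standard library. It is not how an XPath engine evaluates the query; the <root> wrapper and the helper names text_nodes, normalize_space and select are assumptions added for illustration.

```python
import re
import xml.etree.ElementTree as ET

# The two divs from the question, wrapped in a hypothetical <root>
# element so that the fragment parses as a single XML document.
DOC = """<root>
<div class="foo">
  i need this text
  <br/>
  <br/>
  <span>sometext</span>
</div>
<div class="foo">
  <span>some other text</span>
  <span>sometext</span>
</div>
</root>"""


def text_nodes(elem):
    """Yield every descendant text node of elem in document order."""
    if elem.text is not None:
        yield elem.text
    for child in elem:
        yield from text_nodes(child)
        if child.tail is not None:
            yield child.tail


def normalize_space(s):
    """Collapse runs of space, tab, CR and LF, then trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


def select(root, wanted):
    """Emulate //div[.//text()[normalize-space() = wanted]]//text()[normalize-space()].

    Unlike a real XPath node-set, this would list a text node twice if
    matching divs were nested; the sample document has no nested divs.
    """
    selected = []
    for div in root.iter("div"):
        nodes = list(text_nodes(div))
        # Outer predicate: the div has a text node equal to `wanted`
        # after normalization.
        if any(normalize_space(t) == wanted for t in nodes):
            # Final step: keep only text nodes that are not whitespace-only.
            selected.extend(t for t in nodes if normalize_space(t))
    return selected


result = [normalize_space(t) for t in select(ET.fromstring(DOC), "i need this text")]
print(result)
```

Only the first div passes the outer predicate, so nothing from the second div is returned, and the whitespace-only text nodes around the br elements are filtered out by the final step.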

The normalize-space() function takes a string (or, if no argument is given, the string value of the context node) and returns a new string in which all leading and trailing whitespace is removed and every inner run of adjacent whitespace characters is replaced by a single space.
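The behaviour described above can be approximated in a few lines of Python. This is only a sketch of the string transformation, assuming whitespace means space, tab, carriage return and line feed; the function name normalize_space is a hypothetical stand-in and not part of any XPath library.

```python
import re


def normalize_space(s: str) -> str:
    """Approximate XPath normalize-space() for a plain Python string."""
    # Replace each run of whitespace characters with one space,
    # then drop the single space that may remain at either end.
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


padded = normalize_space("\n  i need   this\ttext\n  ")
blank = normalize_space(" \n\t ")
print(repr(padded), repr(blank))
```

A whitespace-only string normalizes to the empty string, which XPath treats as false inside a predicate. That is why text()[normalize-space()] keeps only text nodes with real content, and the same predicate can be appended to the expression from the question: //div[@class="foo"]/text()[normalize-space()].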

Tags: xml, xhtml, xpath

Jay
