
How to get the inner text alone, without the child tags, using HtmlAgilityPack?

I have an HTML page like the one below. I need to extract only the 'blah blah blah' text from the 'span' tag.

<span class="news">
blah blah blah
<div>hello</div>
<div>bye</div> 
</span>

This gives me all the text, including the text of the child 'div' tags:

div.SelectSingleNode(".//span[@class='news']").InnerText.Trim();

This gives me null:

div.SelectSingleNode(".//span[@class='news']/preceding-sibling::text()").InnerText.Trim();

How do I get the text before the 'div' tag using HtmlAgilityPack?

good-to-know asked Oct 18 '14 10:10



1 Answer

Your second attempt was close. Use /text() instead of /preceding-sibling::text(), because the text node is a child of span[@class='news'], not a sibling of it (neither preceding nor following):

div.SelectSingleNode(".//span[@class='news']/text()")
   .InnerText
   .Trim();
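
To make the one-liner above easier to try in isolation, here is a minimal sketch that wraps it in a helper. The class name NewsText and the method FirstOwnText are hypothetical names introduced only for illustration, and the sketch assumes the HtmlAgilityPack NuGet package is referenced. It also adds a null check, on the assumption that you would rather get null back than an exception when the XPath matches nothing.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

// Hypothetical helper class, not part of HtmlAgilityPack.
public static class NewsText
{
    // Returns the trimmed content of the first text node that is a direct
    // child of span.news, or null if the document has no such node.
    public static string FirstOwnText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // text() selects the span's text-node children only, so the
        // contents of the nested div elements are not included.
        HtmlNode textNode =
            doc.DocumentNode.SelectSingleNode("//span[@class='news']/text()");

        // SelectSingleNode returns null when nothing matches, so guard
        // before reading InnerText.
        return textNode?.InnerText.Trim();
    }
}
```

Calling NewsText.FirstOwnText with the markup from the question should give back only the text that sits directly inside the span. Note that SelectSingleNode picks the first matching text node; if the span had more direct text after the div tags, SelectNodes with the same XPath would be the call to look at instead.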
har07 answered Oct 14 '22 16:10
