Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HtmlAgilityPack selecting childNodes not as expected

I am attempting to use the HtmlAgilityPack library to parse some links in a page, but I am not seeing the results I would expect from the methods. In the following I have a HtmlNodeCollection of links. For each link I want to check if there is an image node and then parse its attributes but the SelectNodes and SelectSingleNode methods of linkNode seems to be searching the parent document not the childNodes of linkNode. What gives?

HtmlDocument htmldoc = new HtmlDocument();
htmldoc.LoadHtml(content);
HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
    
foreach(HtmlNode linkNode in linkNodes)
{
    string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
    if (linkTitle == string.Empty)
    {
        HtmlNode imageNode = linkNode.SelectSingleNode("/img[@alt]");     
    }
}

Is there any other way I could get the alt attribute of the image childnode of linkNode if it exists?

like image 986
Sheff Avatar asked May 13 '09 10:05

Sheff


3 Answers

You should remove the forwardslash prefix from "/img[@alt]" as it signifies that you want to start at the root of the document.

HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
like image 95
Richard Szalay Avatar answered Oct 20 '22 08:10

Richard Szalay


With an xpath query you can also use "." to indicate the search should start at the current node.

HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@alt]");
like image 27
ulty4life Avatar answered Oct 20 '22 08:10

ulty4life


Also, watch out for null checks; SelectNodes returns null instead of blank collection.

HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");

**if(linkNodes!=null)**
{
   foreach(HtmlNode linkNode in linkNodes)
  {
     string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
     if (linkTitle == string.Empty)
     {
       **HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");**   
     }
  }
}
like image 44
msqr Avatar answered Oct 20 '22 08:10

msqr