I am trying to use the HtmlAgilityPack library to parse the links in a page, but the methods are not returning the results I expect. In the code below I have an HtmlNodeCollection of links. For each link I want to check whether it contains an image node and then read that node's attributes, but the SelectNodes and SelectSingleNode methods of linkNode seem to search the whole document rather than the ChildNodes of linkNode. What gives?
HtmlDocument htmldoc = new HtmlDocument();
htmldoc.LoadHtml(content);
HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
foreach (HtmlNode linkNode in linkNodes)
{
    string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
    if (linkTitle == string.Empty)
    {
        HtmlNode imageNode = linkNode.SelectSingleNode("/img[@alt]");
    }
}
Is there another way to get the alt attribute of the image child node of linkNode, if it exists?
You should remove the leading forward slash from "/img[@alt]". A path that begins with "/" is absolute, so the search starts at the root of the document rather than at linkNode.
HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
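In an XPath query you can also use "." to indicate that the search should start at the current node. Note that ".//img[@alt]" matches img elements at any depth below linkNode, whereas "img[@alt]" (or "./img[@alt]") matches direct children only.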
HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@alt]");
Also, remember to check for null: SelectNodes returns null rather than an empty collection when nothing matches.
HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
if (linkNodes != null)
{
    foreach (HtmlNode linkNode in linkNodes)
    {
        string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
        if (linkTitle == string.Empty)
        {
            HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
        }
    }
}
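To answer the original question about reading the alt attribute, the pieces above might be combined roughly as in the following sketch. The sample markup in content is hypothetical, and falling back to the alt text when the link has no title is an assumption about the intent. Only the calls already shown above (LoadHtml, SelectNodes, SelectSingleNode, GetAttributeValue) are used. SelectSingleNode also returns null when nothing matches, so imageNode is checked before use.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup, used only for illustration.
        string content =
            "<html><body>" +
            "<img alt='Outside any link' src='outside.png'>" +
            "<a href='/one' title='Has a title'>One</a>" +
            "<a href='/two'><img alt='Second link' src='two.png'></a>" +
            "<a href='/three'>Three</a>" +
            "</body></html>";

        HtmlDocument htmldoc = new HtmlDocument();
        htmldoc.LoadHtml(content);

        HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
        {
            return; // SelectNodes found no links
        }

        foreach (HtmlNode linkNode in linkNodes)
        {
            string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
            if (linkTitle == string.Empty)
            {
                // Relative path: only img children of this link are considered.
                HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
                if (imageNode != null)
                {
                    // Assumption: use the image's alt text as a fallback title.
                    linkTitle = imageNode.GetAttributeValue("alt", string.Empty);
                }
            }

            string href = linkNode.GetAttributeValue("href", string.Empty);
            Console.WriteLine(href + " -> " + linkTitle);
        }
    }
}
```

If the image may be nested deeper inside the link (for example inside a span), ".//img[@alt]" could be used in place of "img[@alt]".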