I'm using the Html Agility Pack to read the contents of my HTML document into a string. Once that is done, I would like to remove certain elements from that content by their class, but I have run into a problem.
My Html looks like this:
<div id="wrapper">
  <div class="maincolumn">
    <div class="breadCrumbContainer">
      <div class="breadCrumbs">
      </div>
    </div>
    <div class="seo_list">
      <div class="seo_head">Header</div>
    </div>
    Content goes here...
  </div>
</div>
Now, I have used an XPath selector to get all the content within the wrapper div and read its InnerHtml property, like so:
node = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
if (node != null)
{
pageContent = node.InnerHtml;
}
From this point, I would like to remove the div with the class "breadCrumbContainer". However, when I use the code below, I get the error: "Node "" was not found in the collection"
node = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
node = node.RemoveChild(node.SelectSingleNode("//div[@class='breadCrumbContainer']"));
if (node != null)
{
pageContent = node.InnerHtml;
}
Can anyone shed some light on this please? I'm quite new to XPath, and really new to the Html Agility Pack library.
Thanks,
Dave
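It's because RemoveChild can only remove a direct child, not a grandchild. In your HTML the "breadCrumbContainer" div is a child of the "maincolumn" div, not of the "wrapper" div, so the wrapper node cannot find it among its own children. Select the node itself and remove it from its actual parent instead: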
HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='breadCrumbContainer']");
node.ParentNode.RemoveChild(node);
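If more than one element can carry the class, or the class might be missing from some pages, the same idea can be wrapped up with null checks. The following is only a sketch: the BreadcrumbRemover class and StripByClass method are hypothetical names made up for illustration, and it assumes the HtmlAgilityPack package is referenced. Note that the XPath test @class='...' matches the attribute value exactly, so an element with several classes would not be matched.

```csharp
using HtmlAgilityPack;

// Hypothetical helper, not part of the Html Agility Pack itself.
public static class BreadcrumbRemover
{
    // Returns the inner HTML of the "wrapper" div after removing every
    // descendant div whose class attribute equals className exactly.
    // Returns an empty string when there is no wrapper div.
    public static string StripByClass(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode wrapper = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
        if (wrapper == null)
        {
            return string.Empty;
        }

        // ".//" searches only below the wrapper node; a leading "//"
        // would search the whole document again.
        HtmlNodeCollection matches =
            wrapper.SelectNodes(".//div[@class='" + className + "']");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (matches != null)
        {
            foreach (HtmlNode match in matches)
            {
                // Remove() detaches the node from its own parent,
                // however deeply it is nested.
                match.Remove();
            }
        }

        return wrapper.InnerHtml;
    }
}
```

With the HTML from the question, it could be called as pageContent = BreadcrumbRemover.StripByClass(html, "breadCrumbContainer");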