
Removing an element by class name with HTMLAgilityPack in C#

I'm using the HTML Agility Pack to read the contents of my HTML document into a string. Once that is done, I would like to remove certain elements from that content by their class, but I have run into a problem.

My HTML looks like this:

<div id="wrapper">
    <div class="maincolumn" >
        <div class="breadCrumbContainer">
            <div class="breadCrumbs">

        <div class="seo_list">
            <div class="seo_head">Header</div>

Content goes here...

Now, I have used an XPath selector to get all the content within the wrapper div, and used the InnerHtml property like so:

            node = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
            if (node != null)
                pageContent = node.InnerHtml;

From this point, I would like to remove the div with the class "breadCrumbContainer". However, when I use the code below, I get the error: "Node "" was not found in the collection"

            node = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
            node = node.RemoveChild(node.SelectSingleNode("//div[@class='breadCrumbContainer']"));

            if (node != null)
                pageContent = node.InnerHtml;

Can anyone shed some light on this, please? I'm quite new to XPath, and really new to the HTML Agility Pack library.



Dave asked Mar 07 '11 10:03


1 Answer

It's because RemoveChild can only remove a direct child, not a grandchild. Try this instead:

    HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='breadCrumbContainer']");
    node.ParentNode.RemoveChild(node); // remove the node through its own parent
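
For a fuller picture, here is a minimal sketch of the whole flow, assuming the HtmlAgilityPack NuGet package is referenced. The class name `BreadcrumbExample`, the method `RemoveBreadcrumbs`, and the `SampleHtml` markup are hypothetical and only loosely modelled on the question's snippet. The sketch also uses `.//` instead of `//` for the inner lookup, because a leading `//` searches the whole document even when it is called on a node.

```csharp
using System;
using HtmlAgilityPack;

public static class BreadcrumbExample
{
    // Hypothetical, well-formed sample modelled on the markup in the question.
    public const string SampleHtml = @"
<div id=""wrapper"">
  <div class=""maincolumn"">
    <div class=""breadCrumbContainer""><div class=""breadCrumbs"">Home</div></div>
    <div class=""seo_list""><div class=""seo_head"">Header</div></div>
  </div>
</div>";

    // Returns the inner HTML of the wrapper div with the breadcrumb div removed,
    // or null if the wrapper div is not present.
    public static string RemoveBreadcrumbs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode wrapper = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
        if (wrapper == null)
            return null;

        // ".//" restricts the search to descendants of wrapper.
        HtmlNode crumbs = wrapper.SelectSingleNode(".//div[@class='breadCrumbContainer']");
        if (crumbs != null)
        {
            // The breadcrumb div is a grandchild of wrapper, so it is removed
            // through its direct parent rather than through wrapper itself.
            crumbs.ParentNode.RemoveChild(crumbs);
        }

        // Read InnerHtml from wrapper. RemoveChild returns the removed node,
        // so its result should not be assigned back to the wrapper variable.
        return wrapper.InnerHtml;
    }

    public static void Main()
    {
        Console.WriteLine(RemoveBreadcrumbs(SampleHtml));
    }
}
```

Note that `@class='breadCrumbContainer'` only matches when the class attribute is exactly that string; an element carrying several classes would need a different XPath test.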
Simon Mourier answered Sep 23 '22 07:09