
Find and remove specified HTML tags using Html Agility Pack

I'm using Html Agility Pack to find all script elements in an existing HTML page, remove them, and save the result to another file. In the code below, bodyNode contains the correct number of script elements, but I can't remove them from the document: the saved file still has those tags.

if (doc.DocumentNode != null)
{
    var bodyNode = doc.DocumentNode.SelectNodes("//script");
    if (bodyNode != null)
    {
        bodyNode.Clear(); // clears the collection only
    }

    doc.Save("some file");
}
user246392 asked Jun 13 '11 16:06


1 Answer

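Clearing the collection returned by SelectNodes only empties that collection; the nodes stay in the document. You need to remove each node from its parent, something like this:

foreach (HtmlNode node in bodyNode)
{
    node.ParentNode.RemoveChild(node);
}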
Simon Mourier answered Nov 25 '22 05:11
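
For reference, here is a minimal end-to-end sketch of the approach from the answer. It is not from the original posts: the class name ScriptStripper, the method RemoveScripts, and the sample HTML are hypothetical, and it assumes the HtmlAgilityPack NuGet package is referenced. It uses HtmlNode.Remove(), which detaches a node from its parent much like node.ParentNode.RemoveChild(node), and iterates over a snapshot of the matches as a precaution.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

public static class ScriptStripper // hypothetical name, for illustration only
{
    // Returns the given HTML with every <script> element removed.
    public static string RemoveScripts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var scripts = doc.DocumentNode.SelectNodes("//script");
        if (scripts != null)
        {
            // Iterate over a snapshot so removal cannot interfere with enumeration.
            foreach (HtmlNode node in scripts.ToList())
            {
                node.Remove(); // detaches the node from its parent in the document
            }
        }

        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        // Sample input, made up for this sketch.
        string html = "<html><head><script src=\"a.js\"></script></head>" +
                      "<body><p>Hello</p><script>alert(1);</script></body></html>";
        Console.WriteLine(RemoveScripts(html));
    }
}
```

To write the result to a file as in the question, call doc.Save(path) on the document after the loop instead of returning OuterHtml.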