
HtmlAgilityPack replace node

I want to replace a node with a new node. How can I get the exact position of the node and do a complete replace?

I've tried the following, but I can't figure out how to get the index of the node or which parent node to call ReplaceChild() on.

string html = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
HtmlDocument document = new HtmlDocument();
document.LoadHtml(html);

var bolds = document.DocumentNode.Descendants().Where(item => item.Name == "b");

foreach (var item in bolds)
{

    string newNodeHtml = GenerateNewNodeHtml();
    HtmlNode newNode = new HtmlNode(HtmlNodeType.Text, document, ?);
    item.ParentNode.ReplaceChild( )
}
Omar asked Jul 21 '11 20:07



1 Answer

To create a new node, use the HtmlNode.CreateNode() factory method rather than calling the constructor directly.

This code should work out for you:

var htmlStr = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
var doc = new HtmlDocument();
doc.LoadHtml(htmlStr);

var query = doc.DocumentNode.Descendants("b");
foreach (var item in query.ToList())
{
    var newNodeStr = "<foo>bar</foo>";
    var newNode = HtmlNode.CreateNode(newNodeStr);
    item.ParentNode.ReplaceChild(newNode, item);
}

Note that we need to call ToList() on the query. Descendants() is evaluated lazily, and the loop modifies the document while the query is still being enumerated, so without the snapshot it would fail.
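The same snapshot idea can be seen with a plain List<string>, without HtmlAgilityPack at all. This is only an analogy (the list and its contents are made up for illustration, and it says nothing about how Descendants() is implemented internally), but it shows why copying the matches before the loop avoids the problem:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class SnapshotDemo
{
    static void Main()
    {
        // Hypothetical stand-in for the document's nodes.
        var tags = new List<string> { "b", "strong", "b" };

        try
        {
            // Editing the list while enumerating it directly is rejected at run time.
            foreach (var tag in tags)
            {
                if (tag == "b") tags.Remove(tag);
            }
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("Live enumeration failed: " + ex.Message);
        }

        // ToList() copies the matches up front, so the loop walks the copy
        // while the original collection is free to change.
        foreach (var tag in tags.Where(t => t == "b").ToList())
        {
            tags.Remove(tag);
        }

        Console.WriteLine(string.Join(",", tags)); // prints "strong"
    }
}
```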


If you wish to replace with this string:

"some text <b>node</b> <strong>another node</strong>"

The problem is that it is no longer a single node but a series of nodes. HtmlNode.CreateNode() parses it fine, but the node you get back references only the first node of the sequence. To insert the whole sequence, you would need to replace using that node's parent.

var htmlStr = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
var doc = new HtmlDocument();
doc.LoadHtml(htmlStr);

var query = doc.DocumentNode.Descendants("b");
foreach (var item in query.ToList())
{
    var newNodesStr = "some text <b>node</b> <strong>another node</strong>";
    var newHeadNode = HtmlNode.CreateNode(newNodesStr);
    item.ParentNode.ReplaceChild(newHeadNode.ParentNode, item);
}
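If passing the parent node to ReplaceChild() does not suit your version of the library, another option is to parse the replacement markup into its own HtmlDocument and insert its top-level nodes one at a time. This is only a sketch and has not been tested against any particular HtmlAgilityPack release. The class name and the fragment variable are illustrative, and it assumes InsertBefore(), RemoveChild() and CloneNode(true) behave as their names suggest:

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class ReplaceWithFragment
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<b>bold_one</b><strong>strong</strong><b>bold_two</b>");

        // Parse the replacement markup in its own document so that every
        // top-level node of the fragment is available, not just the first.
        var fragment = new HtmlDocument();
        fragment.LoadHtml("some text <b>node</b> <strong>another node</strong>");

        // Snapshot the targets first: the loop below edits the tree.
        foreach (var item in doc.DocumentNode.Descendants("b").ToList())
        {
            var parent = item.ParentNode;
            foreach (var child in fragment.DocumentNode.ChildNodes)
            {
                // Insert a deep copy so each target gets its own nodes.
                parent.InsertBefore(child.CloneNode(true), item);
            }
            // Drop the original <b> once its replacements are in place.
            parent.RemoveChild(item);
        }

        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

The <b> inside the replacement text is not revisited, because the list of targets was captured before any insertion happened.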
Jeff Mercado answered Oct 15 '22 13:10
