
HtmlAgilityPack set node InnerText

I want to replace the inner text of HTML tags with other text. I am using HtmlAgilityPack, and I use this code to extract all the text nodes:

    HtmlDocument doc = new HtmlDocument();
    doc.Load("some path");

    foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//text()[normalize-space(.) != '']"))
    {
        // How to replace node.InnerText with some text ?
    }

But InnerText is read-only. How can I replace the text with other text and save the result to a file?

Shahin asked Nov 25 '11 21:11



2 Answers

Try the code below. It selects every non-empty text node under <body> and skips the text content of <script> tags. Compared with your XPath expression, this one also restricts the match to leaf nodes and filters out script content. You may need to add some additional filtering.

    var nodes = doc.DocumentNode.SelectNodes("//body//text()[(normalize-space(.) != '') and not(parent::script) and not(*)]");
    foreach (HtmlNode htmlNode in nodes)
    {
        // Swap each text node for a new node that carries the replacement text
        htmlNode.ParentNode.ReplaceChild(HtmlTextNode.CreateNode(htmlNode.InnerText + "_translated"), htmlNode);
    }
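
For the "save them to file" part of the question, here is a minimal end-to-end sketch of the same idea. It is an illustration under a few assumptions, not the only way to do it: the inline HTML, the `Translate` helper, and the `output.html` path are all hypothetical, and instead of `ReplaceChild` it writes to the `Text` property of `HtmlTextNode`, which is settable. It also guards against `SelectNodes` returning `null` when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class ReplaceTextDemo
{
    // Hypothetical stand-in for whatever produces the replacement text.
    static string Translate(string text) => text + "_translated";

    static void Main()
    {
        var doc = new HtmlDocument();
        // Inline HTML keeps the sketch self-contained; use doc.Load(path) for a file.
        doc.LoadHtml("<html><body><p>Hello <b>world</b></p><script>var x = 1;</script></body></html>");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(
            "//body//text()[normalize-space(.) != '' and not(parent::script)]");

        if (nodes != null)
        {
            foreach (HtmlNode node in nodes)
            {
                // Text nodes are HtmlTextNode instances, and their Text property is writable.
                if (node is HtmlTextNode textNode)
                {
                    textNode.Text = Translate(textNode.Text);
                }
            }
        }

        // Hypothetical output path; Save writes the modified document to disk.
        doc.Save("output.html");
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

Note that `Text` holds raw markup, so if the replacement can contain characters such as `<` or `&`, it is probably safer to encode it first, for example with `System.Net.WebUtility.HtmlEncode`.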
Yuriy Rozhovetskiy answered Sep 23 '22 18:09



Strange, but I found that InnerHtml isn't read-only. When I set it like this:

aElement.InnerHtml = "sometext"; 

the value of InnerText also changed to "sometext".
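
A small sketch of that behaviour, assuming a hypothetical `<a>` element loaded from an inline string. One caveat worth keeping in mind: assigning `InnerHtml` replaces all of the element's child nodes, so any nested markup (the `<b>` here) is discarded along with the old text.

```csharp
using System;
using HtmlAgilityPack;

class InnerHtmlDemo
{
    static void Main()
    {
        var doc = new HtmlDocument();
        // Hypothetical sample markup with a nested element.
        doc.LoadHtml("<a href=\"#\">old <b>text</b></a>");

        HtmlNode aElement = doc.DocumentNode.SelectSingleNode("//a");

        // InnerHtml is writable; the assignment replaces every child of <a>.
        aElement.InnerHtml = "sometext";

        Console.WriteLine(aElement.InnerText); // text now comes from the new content
        Console.WriteLine(aElement.OuterHtml); // the nested <b> is gone
    }
}
```

This is convenient when the element contains only text. When the element mixes text and child tags, replacing the individual text nodes keeps the surrounding markup intact.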

lena answered Sep 21 '22 18:09
