I am trying to get the InnerText of every node in an HtmlDocument, for any HTML document.
I have been doing some research, but I haven't found a way to walk through all the parent and child nodes in the entire document without having to specify node names.
I want to do this because I will be working with many different HTML documents, so specifying node names is not an option for me at this point.
I figured it out. It turned out to be simple; I just didn't know how to use these functions:
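// Load the document (MyIO.bingPathToAppDir is my own helper for building the file path).
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.Load(MyIO.bingPathToAppDir("Test data/testHTML.html"));

// Start at the document root and walk each top-level node.
HtmlNode j = htmlDoc.DocumentNode;
foreach (HtmlNode node in j.ChildNodes)
{
    checkNode(node);
}

// Recursively visits the children of a node and prints the text of every leaf.
static void checkNode(HtmlNode node)
{
    foreach (HtmlNode n in node.ChildNodes)
    {
        if (n.HasChildNodes)
        {
            // Not a leaf yet: keep descending.
            checkNode(n);
        }
        else
        {
            // Leaf node (usually a text node): print its text.
            Console.WriteLine(n.InnerText);
        }
    }
}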
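For comparison, the same traversal can probably be done without writing the recursion by hand. The sketch below is only an illustration, not the code from the answer above: the class name TextExtractor and the method GetAllText are made up for this example, and it assumes the HtmlAgilityPack package is referenced. It relies on HtmlNode.Descendants() to enumerate every node under the root, keeps only the text nodes, and decodes entities with HtmlEntity.DeEntitize. Unlike the code above, it trims the text and skips whitespace-only nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class TextExtractor
{
    // Returns the trimmed, entity-decoded text of every non-blank text node,
    // in the order the nodes appear in the document.
    public static List<string> GetAllText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file

        return doc.DocumentNode
            .Descendants()                                   // every node below the root
            .Where(n => n.NodeType == HtmlNodeType.Text)      // keep text nodes only
            .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim())
            .Where(t => t.Length > 0)                         // drop whitespace-only nodes
            .ToList();
    }
}
```

Calling TextExtractor.GetAllText("<div><p>Hello</p><span>World</span></div>") and printing each item with Console.WriteLine should give one line per text node. Because no node names are specified anywhere, this should work across different HTML documents as well.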