I have an ASP.NET page and a custom class that fetches a specified web page and returns its body:
protected String GetHtml()
{
    Thread thread = new Thread(new ThreadStart(GetHtmlWorker));
    thread.SetApartmentState(ApartmentState.STA);
    thread.Start();
    thread.Join();
    return docHtml;
}
protected void GetHtmlWorker()
{
    using (WebBrowser browser = new WebBrowser())
    {
        browser.ScriptErrorsSuppressed = true;
        browser.Navigate(_url);
        // Wait for control to load page
        while (browser.ReadyState != WebBrowserReadyState.Complete)
            Application.DoEvents();
        docHtml = browser.DocumentText;
    }
}
However, I need the rendered DOM HTML rather than the original page source, because I perform some extra operations on the DOM with jQuery.
Here is one solution I found for getting the rendered HTML (the DOM) after the JavaScript has run:
Place a WebBrowser control named webBrowser1 on the Form of class Form1.
[Form1.cs[Design]]

Then for code use:
[Form1.cs]
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;
namespace WebBrowserTest
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
            this.webBrowser1.ObjectForScripting = new MyScript();
        }
        private void Form1_Load(object sender, EventArgs e)
        {
            webBrowser1.Navigate("http://localhost:6489/Default.aspx");
        }
        private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            webBrowser1.Navigate("javascript: window.external.CallServerSideCode();");
        }
        [ComVisible(true)]
        public class MyScript
        {
            public void CallServerSideCode()
            {
                var doc = ((Form1)Application.OpenForms[0]).webBrowser1.Document;
            }
        }
    }
}
In Form1_Load, change the URL passed to webBrowser1.Navigate("http://localhost:6489/Default.aspx") to the page whose DOM you want to obtain after it has been processed by JavaScript.
You can access the modified DOM in the CallServerSideCode() method, for example:
doc.GetElementById("myDataTable");
Or you can access the rendered HTML like this:
var renderedHtml = doc.GetElementsByTagName("HTML")[0].OuterHtml;
As George said in one of the comments, in theory you can get the DOM directly in webBrowser1_DocumentCompleted by using:
webBrowser1.Document.GetElementsByTagName("HTML")[0].OuterHtml;
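For the original ASP.NET scenario, where there is no form, the same idea might be folded back into the STA worker thread from the question. The following is only a minimal sketch under some assumptions: the class name RenderedHtmlFetcher and the method GetRenderedHtml are hypothetical, there is no timeout or error handling, and scripts that keep changing the DOM after DocumentCompleted fires (for example, late AJAX callbacks) would not be reflected in the result.

```csharp
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper, not part of the original post.
public static class RenderedHtmlFetcher
{
    // Loads the URL in a WebBrowser on a dedicated STA thread and
    // returns the serialized DOM instead of the raw page source.
    public static string GetRenderedHtml(string url)
    {
        string html = null;

        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser())
            {
                browser.ScriptErrorsSuppressed = true;

                browser.DocumentCompleted += (sender, e) =>
                {
                    // DocumentCompleted can fire once per frame; wait
                    // until the whole document reports Complete.
                    if (browser.ReadyState != WebBrowserReadyState.Complete)
                        return;

                    var root = browser.Document.GetElementsByTagName("HTML");

                    // Fall back to the source text if no <html> element is found.
                    html = root.Count > 0
                        ? root[0].OuterHtml
                        : browser.DocumentText;

                    // Stop the message loop started by Application.Run() below.
                    Application.ExitThread();
                };

                browser.Navigate(url);

                // Run a message loop on this thread so the control can
                // raise its events (replaces the DoEvents polling loop).
                Application.Run();
            }
        });

        // The WebBrowser control requires a single-threaded apartment.
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join();

        return html;
    }
}
```

It could then be called from the page in place of GetHtml(), for example: var html = RenderedHtmlFetcher.GetRenderedHtml(_url);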