If I use this:
WebClient client = new WebClient();
String htmlCode = client.DownloadString("http://test.net");
I am able to use the HTML Agility Pack to scan the HTML and get most of the tags that I need, but it's missing the HTML that is rendered by JavaScript.
My question is: how do I get the final rendered page source using C#? Is there something more to the WebClient that returns the final rendered source after the JavaScript has run?
The HTML Agility Pack alone is not enough to do what you want. WebClient only downloads the raw response from the server and never executes scripts, so you need a JavaScript engine as well. To get one, you may want to check out something like GeckoFX, which lets you embed a fully functional web browser in your application and then programmatically access the contents of the DOM after the page has rendered.
http://code.google.com/p/geckofx/
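To illustrate the general idea (load the page in an embedded browser, wait for it to finish, then read the DOM), here is a minimal sketch. Note that it does not use GeckoFX: it swaps in the WebBrowser control that ships with Windows Forms, only because its API is built into the .NET Framework. The class name RenderedSource and the method name Get are made up for this example, and I'm assuming a project that references System.Windows.Forms.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper, not part of any library.
static class RenderedSource
{
    // Loads the URL in an off-screen WebBrowser control and returns the
    // HTML of the DOM as it stands once the document has completed loading.
    public static string Get(string url)
    {
        string html = null;

        // The WebBrowser control needs an STA thread with a message loop.
        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser())
            {
                browser.ScriptErrorsSuppressed = true;

                browser.DocumentCompleted += (sender, e) =>
                {
                    // The event can fire once per frame; wait until the
                    // whole page reports that it is complete.
                    if (browser.ReadyState != WebBrowserReadyState.Complete)
                        return;

                    // Serialize the live DOM, including script-generated nodes.
                    html = browser.Document
                                  .GetElementsByTagName("html")[0]
                                  .OuterHtml;

                    // Stop the message loop started by Application.Run().
                    Application.ExitThread();
                };

                browser.Navigate(url);
                Application.Run();
            }
        });

        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join();
        return html;
    }
}
```

You could then pass the result to the HTML Agility Pack in place of the DownloadString output, e.g. string htmlCode = RenderedSource.Get("http://test.net");. Keep in mind that content loaded by scripts after the page has finished loading (AJAX calls on a timer, for example) may still be missing at this point, so you might need an extra wait or a check for a specific element.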