
Get the web page source, including the HTML rendered by JavaScript

If I use this

WebClient client = new WebClient();
String htmlCode = client.DownloadString("http://test.net");

I am able to use the HTML Agility Pack to scan the HTML and get most of the tags that I need, but it's missing the HTML that is rendered by the JavaScript.

My question is: how do I get the final rendered page source using C#? Is there something more to the WebClient that gets the final rendered source after the JavaScript has run?

Hello-World asked Nov 13 '22 23:11



1 Answer

The HTML Agility Pack alone is not enough to do what you want. WebClient only downloads the raw response and never executes scripts, so you need a JavaScript engine as well. To get one, you may want to check out something like Geckofx, which will allow you to embed a fully functional web browser into your application and then programmatically access the contents of the DOM after the page has rendered.

http://code.google.com/p/geckofx/
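
To illustrate the general idea of "load the page in an embedded browser, then read the DOM", here is a minimal sketch. It does not use Geckofx. It swaps in the WebBrowser control that ships with Windows Forms and uses the Internet Explorer engine, so it assumes Windows, .NET Framework, and a reference to System.Windows.Forms. The names RenderedPage and Fetch are made up for this example. Note that DocumentCompleted only signals that the document has loaded; content that scripts inject later (for example after an AJAX call) may still be missing, so treat this as a starting point rather than a complete solution.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper, not part of Geckofx or the HTML Agility Pack.
static class RenderedPage
{
    public static string Fetch(string url)
    {
        string html = null;

        // The WebBrowser control needs an STA thread with a message loop.
        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser { ScriptErrorsSuppressed = true })
            {
                browser.DocumentCompleted += (sender, e) =>
                {
                    // The event can fire once per frame; wait until the
                    // whole document reports that it is complete.
                    if (browser.ReadyState != WebBrowserReadyState.Complete)
                        return;

                    // Read the DOM as it exists now, after scripts have run,
                    // instead of the raw markup that was downloaded.
                    html = browser.Document.GetElementsByTagName("html")[0].OuterHtml;
                    Application.ExitThread(); // stop the message loop
                };

                browser.Navigate(url);
                Application.Run(); // pump messages until ExitThread is called
            }
        });

        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join();
        return html;
    }
}
```

The returned string can then be passed to the HTML Agility Pack (for example through HtmlDocument.LoadHtml) in place of the result of client.DownloadString("http://test.net").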

javram answered Nov 15 '22 11:11
