I would like to scrape some content from a dynamic web page (it seems to be built with MVC).
The scraping logic is done with HTML Agility Pack, but the issue is that the HTML returned when I request the URL from a browser differs from the response I get for the same URL from an ASP.NET web request.
The browser response contains the dynamic data I need (it is rendered based on the value passed in the query string), but the WebResponse result is different.
Could you please help me get the actual content of the dynamic web page via WebRequest?
Below is the code I used to read the page:
WebRequest request = WebRequest.Create(sURL);
request.Method = "GET";
// Get the response
WebResponse response = request.GetResponse();
// Read the stream from the response
StreamReader reader = new StreamReader(response.GetResponseStream(), System.Text.Encoding.UTF8);
To get the content of any web page using HttpWebRequest:
...
// Requires: using System; using System.IO; using System.Net;
// We will store the HTML response of the request here
string siteContent = string.Empty;
// The url you want to grab
string url = "http://google.com";
// Here we're creating our request, we haven't actually sent the request to the site yet...
// we're simply building our HTTP request to shoot off to google...
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.AutomaticDecompression = DecompressionMethods.GZip;
// Right now, this is what our HTTP request has been built into:
/*
GET http://google.com/ HTTP/1.1
Host: google.com
Accept-Encoding: gzip
Connection: Keep-Alive
*/
// Wrap everything that can be disposed in using blocks...
// They dispose of objects and prevent them from lying around in memory...
using(HttpWebResponse response = (HttpWebResponse)request.GetResponse()) // Go query google
using(Stream responseStream = response.GetResponseStream()) // Load the response stream
using(StreamReader streamReader = new StreamReader(responseStream)) // Load the stream reader to read the response
{
siteContent = streamReader.ReadToEnd(); // Read the entire response and store it in the siteContent variable
}
// magic...
Console.WriteLine(siteContent);
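If the response still differs from what the browser shows, one possible cause is that the server varies its output based on request headers or cookies that a bare request does not send. The following is only a minimal sketch under that assumption: the class name BrowserLikeFetcher, its two methods, and the User-Agent and Accept values are hypothetical choices for illustration, not something taken from the question. If the missing data is filled in by JavaScript after the page loads, no header change will help, because HttpWebRequest does not execute scripts.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class (not from the original post).
public static class BrowserLikeFetcher
{
    // Builds a GET request that carries browser-style headers.
    // Nothing is sent to the server until GetResponse() is called.
    public static HttpWebRequest BuildRequest(string url, CookieContainer cookies)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        // Assumed header values; copy the real ones from your browser's network tools.
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers.Add("Accept-Language", "en-US,en;q=0.9");
        // Transparently decompress gzip/deflate responses.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        // Keep cookies so that session-dependent pages see the same session across calls.
        request.CookieContainer = cookies;
        return request;
    }

    // Sends the request and returns the response body as a string.
    public static string Fetch(string url, CookieContainer cookies)
    {
        HttpWebRequest request = BuildRequest(url, cookies);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A call such as BrowserLikeFetcher.Fetch(sURL, new CookieContainer()) would then return the HTML to hand to HTML Agility Pack. Reusing the same CookieContainer across several calls lets cookies set by an earlier response be sent with later requests.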