I came across a website that looked simple enough that I was confident I could read its data with HttpWebRequest, using both GET and POST requests. The GET requests work fine. The POST request does not raise any error either, but the posted form data has no effect on the results that are returned. The form has fields to filter the data by date, yet even though every required field is posted, the returned data is not filtered. I have added every header and all the form data, and I have also added cookies to the request.
The url for the webpage is http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0
It looks like a very ordinary website, but since it is an ASPX page that uses ViewState and EventValidation, I did not expect this to be very easy.
My first step was to analyze the site's GET and POST requests with Fiddler, and to my surprise Fiddler does not capture any traffic for this URL. I also tried Charles, but it does not capture this URL either. Apart from this URL, both Fiddler and Charles capture everything else. I should also mention that when I called the URL from a console application using HttpWebRequest, both Fiddler and Charles captured it, but they do not capture it from Chrome, Firefox or Internet Explorer 11.
So I analyzed the network activity with the developer tools in Firefox, and everything was visible there (headers, parameters and cookies). In Chrome no cookies were present. When I inspected the cookies on the response to an HttpWebRequest, there were no cookies either. So something really strange is going on with this website.
I have managed to write a simple function that creates the request and gets the response. I first send a GET request, read the page as a string and extract the ViewState, EventValidation and so on from it. I then use this information in a second HttpWebRequest, which is a POST. Everything runs and I get a response, but not the expected one. I want the records between two given dates and I have specified these dates in the form data, but the POST request still does not return the filtered data. The function is shown below, and I would really appreciate any suggestions about why this happens and how to handle it. Understanding this has become a challenge for me, as I cannot see why this simple website does not show up in Fiddler. (The page uses a JavaScript postback.)
The code may look long and scary, but it is actually very simple and straightforward.
Try
' First GET Request to obtain Viewstate, Eventvalidation etc
Dim objRequest2 As Net.HttpWebRequest = DirectCast(HttpWebRequest.Create("http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"), HttpWebRequest)
objRequest2.Method = "GET"
objRequest2.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
objRequest2.Headers.Add("Accept-Encoding", "gzip, deflate")
objRequest2.Headers.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6,ur;q=0.4")
objRequest2.KeepAlive = True
objRequest2.ContentType = "application/x-www-form-urlencoded"
objRequest2.Host = "www.bseindia.com"
objRequest2.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
objRequest2.AutomaticDecompression = DecompressionMethods.Deflate Or DecompressionMethods.GZip
Dim LoginRes2 As Net.HttpWebResponse
Dim sr2 As IO.StreamReader
LoginRes2 = objRequest2.GetResponse()
sr2 = New IO.StreamReader(LoginRes2.GetResponseStream)
Dim getString As String = sr2.ReadToEnd()
Dim getCookieCollection = objRequest2.CookieContainer
' get the page ViewState
Dim viewStateFlag As String = "id=""__VIEWSTATE"" value="""
Dim i As Integer = getString.IndexOf(viewStateFlag) + viewStateFlag.Length
Dim j As Integer = getString.IndexOf("""", i)
Dim viewState As String = getString.Substring(i, j - i)
' get page EventValidation
Dim eventValidationFlag As String = "id=""__EVENTVALIDATION"" value="""
i = getString.IndexOf(eventValidationFlag) + eventValidationFlag.Length
j = getString.IndexOf("""", i)
Dim eventValidation As String = getString.Substring(i, j - i)
' get page ViewStateGenerator
Dim viewstateGeneratorFlag As String = "id=""__VIEWSTATEGENERATOR"" value="""
i = getString.IndexOf(viewstateGeneratorFlag) + viewstateGeneratorFlag.Length
j = getString.IndexOf("""", i)
Dim viewStateGenerator As String = getString.Substring(i, j - i)
viewState = System.Web.HttpUtility.UrlEncode(viewState)
eventValidation = System.Web.HttpUtility.UrlEncode(eventValidation)
Dim LoginRes As Net.HttpWebResponse
Dim sr As IO.StreamReader
Dim objRequest As Net.HttpWebRequest
' Second POST request to post the form data along with cookies
objRequest = DirectCast(HttpWebRequest.Create("http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"), HttpWebRequest)
Dim formDataCollection As New NameValueCollection
formDataCollection.Add("__EVENTTARGET", "")
formDataCollection.Add("__EVENTARGUMENT", "")
formDataCollection.Add("__VIEWSTATE", viewState)
formDataCollection.Add("__VIEWSTATEGENERATOR", viewStateGenerator)
formDataCollection.Add("__EVENTVALIDATION", eventValidation)
formDataCollection.Add("fmdate", "20160104")
formDataCollection.Add("eddate", "20160204")
formDataCollection.Add("hidCurrentDate", "2016/02/04")
formDataCollection.Add("ctl00_ContentPlaceHolder1_hdnCode", "")
formDataCollection.Add("txtDate", "04/01/2016")
formDataCollection.Add("ddlCalMonthDiv3", "1")
formDataCollection.Add("ddlCalYearDiv3", "2016")
formDataCollection.Add("txtTodate", "04/02/2016")
formDataCollection.Add("ddlCalMonthDiv4", "2")
formDataCollection.Add("ddlCalYearDiv4", "2016")
formDataCollection.Add("Hidden1", "")
formDataCollection.Add("ctl00_ContentPlaceHolder1_GetQuote1_smartSearch", "Enter Security Name / Code / ID")
formDataCollection.Add("btnSubmit.x", "44")
formDataCollection.Add("btnSubmit.y", "2")
Dim strFormdata As String = formDataCollection.ToString()
Dim encoding As New ASCIIEncoding
Dim postBytes As Byte() = encoding.GetBytes(strFormdata)
objRequest.Method = "POST"
objRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
objRequest.Headers.Add("Accept-Encoding", "gzip, deflate")
objRequest.Headers.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6,ur;q=0.4")
objRequest.Headers.Add("Cache-Control", "private, max-age=60")
objRequest.KeepAlive = True
objRequest.ContentType = "application/x-www-form-urlencoded"
objRequest.Host = "www.bseindia.com"
objRequest.Headers.Add("Origin", "http://www.bseindia.com")
objRequest.Referer = "http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"
objRequest.Headers.Add("Upgrade-Insecure-Requests", "1")
objRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
objRequest.ContentType = "text/html; charset=utf-8"
objRequest.Date = "Thu, 04 Feb 2016 13:42:04 GMT"
objRequest.Headers.Add("Server", "Microsoft-IIS/8.0")
objRequest.Headers.Add("Vary", "Accept-Encoding")
objRequest.Headers.Add("X-AspNet-Version", "2.0.50727")
objRequest.Headers.Add("ASP.NET", "ASP.NET")
objRequest.AutomaticDecompression = DecompressionMethods.Deflate Or DecompressionMethods.GZip
Dim gaCookies As New CookieContainer()
Dim cookie1 As New Cookie("__asc", "f673f0d5152a823bc335f575d34")
cookie1.Domain = ".bseindia.com"
cookie1.Path = "/"
gaCookies.Add(cookie1)
Dim cookie2 As New Cookie("__auc", "f673f0d5152a823bc335f575d34")
cookie2.Domain = ".bseindia.com"
cookie2.Path = "/"
gaCookies.Add(cookie2)
Dim cookie3 As New Cookie("__utma", "253454874.280640365.1454519857.1454519865.1454519865.1")
cookie3.Domain = ".bseindia.com"
cookie3.Path = "/"
gaCookies.Add(cookie3)
Dim cookie4 As New Cookie("__utmb", "253454874.1.10.1454519865")
cookie4.Domain = ".bseindia.com"
cookie4.Path = "/"
gaCookies.Add(cookie4)
Dim cookie5 As New Cookie("__utmc", "253454874")
cookie5.Domain = ".bseindia.com"
cookie5.Path = "/"
gaCookies.Add(cookie5)
Dim cookie6 As New Cookie("__utmt", "1")
cookie6.Domain = ".bseindia.com"
cookie6.Path = "/"
gaCookies.Add(cookie6)
Dim cookie7 As New Cookie("__utmz", "253454874.1454519865.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)")
cookie7.Domain = ".bseindia.com"
cookie7.Path = "/"
gaCookies.Add(cookie7)
Dim cookie8 As New Cookie("_ga", "GA1.2.280640365.1454519857")
cookie8.Domain = ".bseindia.com"
cookie8.Path = "/"
gaCookies.Add(cookie8)
Dim cookie9 As New Cookie("_gat", "1")
cookie9.Domain = ".bseindia.com"
cookie9.Path = "/"
gaCookies.Add(cookie9)
Dim postStream As Stream = objRequest.GetRequestStream()
postStream.Write(postBytes, 0, postBytes.Length)
postStream.Flush()
postStream.Close()
LoginRes = objRequest.GetResponse()
sr = New IO.StreamReader(LoginRes.GetResponseStream)
ReadWebsite = sr.ReadToEnd()
sr.Close()
sr = Nothing
LoginRes.Close()
LoginRes = Nothing
objRequest = Nothing
Exit Function
Catch ex As Exception
ReadWebsite = Nothing
End Try
Note: raw form data for the dates, as sent by the browser (without ViewState and EventValidation):
fmdate:20160130
eddate:20160205
hidCurrentDate:2016/02/05
ctl00_ContentPlaceHolder1_hdnCode:
txtDate:04/01/2016
ddlCalMonthDiv3:1
ddlCalYearDiv3:2016
txtTodate:04/02/2016
ddlCalMonthDiv4:2
ddlCalYearDiv4:2016
Hidden1:
ctl00_ContentPlaceHolder1_GetQuote1_smartSearch:Enter Security Name / Code / ID
btnSubmit.x:55
btnSubmit.y:13
You could consider running the site in a browser and using a tool to control the browser, instead of issuing GET/POST requests directly. This may be easier and slightly more robust than your current approach.
For example, Selenium WebDriver: http://www.seleniumhq.org/projects/webdriver/
You would load the page, set the values of the form fields (using CSS-style selectors to find the appropriate fields) and then click the button. You can automate all of this and get the page source (unfortunately I don't think you can get the full HTML in its current state, after JavaScript has run, but you can potentially use the API to get the elements you need).
Api documentation: http://seleniumhq.github.io/selenium/docs/api/dotnet/
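As a rough illustration of the load / fill / click flow described above, here is a minimal sketch using Selenium's Python bindings (the same steps exist in the .NET bindings). It is only a sketch under assumptions: the element ids `txtDate`, `txtTodate` and `btnSubmit` are guessed from the field names in the posted form data and may not match the real page, the helper names `to_site_date` and `fetch_filtered_page` are made up for this example, and the date boxes may be driven by a calendar widget that does not accept typed input.

```python
from datetime import date


def to_site_date(d):
    """Format a date as dd/mm/yyyy, the shape of txtDate in the posted form data."""
    return d.strftime("%d/%m/%Y")


def fetch_filtered_page(url, from_date, to_date):
    """Drive a real browser: load the page, fill both dates, click submit."""
    # Imported lazily so the date helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Assumption: element ids match the field names seen in the form data.
        for element_id, value in (("txtDate", to_site_date(from_date)),
                                  ("txtTodate", to_site_date(to_date))):
            box = driver.find_element(By.ID, element_id)
            box.clear()
            box.send_keys(value)
        driver.find_element(By.ID, "btnSubmit").click()
        # The browser's own serialization of the page after the postback.
        return driver.page_source
    finally:
        driver.quit()
```

A call would look like `fetch_filtered_page("http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0", date(2016, 1, 4), date(2016, 2, 4))`. Because the browser handles ViewState, EventValidation and cookies itself, none of them has to be copied by hand.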
You should indeed include ALL fields from the form, including the hidden ones, as well as the ASP.NET session identifier that is stored in cookies. That way you fully emulate the browser's request and achieve your goal. To show what you have to submit, see http://pastebin.com/AsSABgU6
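To make "include all fields" concrete, here is a small language-neutral sketch in Python of the idea: collect every hidden input from the GET response instead of picking three by hand, merge in the visible values, and URL-encode the result exactly once. The class and function names and the sample HTML are made up for illustration; the real page will contain more fields.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_post_body(page_html, visible_fields):
    """Merge hidden fields from the GET response with the values to submit."""
    collector = HiddenInputCollector()
    collector.feed(page_html)
    form = dict(collector.fields)
    form.update(visible_fields)
    # urlencode percent-encodes every name and value once; do not pre-encode.
    return urlencode(form)


# Hypothetical stand-in for the HTML returned by the first GET request.
sample_html = (
    '<form><input type="hidden" name="__VIEWSTATE" value="/wEPDw+abc=" />'
    '<input type="hidden" name="__EVENTVALIDATION" value="/wEWAg==" />'
    '<input type="text" name="txtDate" value="" /></form>'
)

body = build_post_body(
    sample_html, {"txtDate": "04/01/2016", "txtTodate": "04/02/2016"}
)
print(body)
```

Three things in the question's code are worth checking against this. First, as far as I know, `ToString()` on a plain `NameValueCollection` returns the type name rather than an encoded `name=value&...` body, so the form data may never reach the server. Second, `ContentType` is set to `application/x-www-form-urlencoded` and later overwritten with `text/html; charset=utf-8`. Third, `gaCookies` is filled but never assigned to `objRequest.CookieContainer`, and no container is assigned to the GET request either, so no cookies are sent or captured.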