I am trying to scrape a website to get the contents of a textarea.
I'm using:
HtmlDocument doc = this.webBrowser1.Document;
When I view the page source, it shows <textarea name="message" class="profile">
But when I try to access this textarea with:
HtmlDocument doc = this.webBrowser1.Document;
doc.GetElementsByTagName("textarea")
.GetElementsByName("message")[0]
.SetAttribute("value", "Hello");
It shows the error:
Value of '0' is not valid for 'index'. 'index' should be between 0 and -1.
Parameter name: index
Any help?
For your current need you can simply use this:
doc.GetElementsByTagName("textarea")[0].InnerText = "Hello";
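The error in the question ("between 0 and -1") means the collection was empty, and the one-liner above would fail the same way if no textarea has been loaded yet. One possible cause, which is an assumption since the question does not show when the code runs, is that the document is accessed before the page has finished loading. A minimal sketch that waits for DocumentCompleted and avoids indexing an empty collection could look like this (the form class name and the URL are placeholders):

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form; only webBrowser1 and the "message" textarea come from the question.
public class ScraperForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };

    public ScraperForm()
    {
        Controls.Add(webBrowser1);
        // Touch the document only after the browser reports that loading finished.
        webBrowser1.DocumentCompleted += OnDocumentCompleted;
        webBrowser1.Navigate("http://www.website.com/"); // placeholder URL
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        HtmlDocument doc = webBrowser1.Document;
        if (doc == null) return;

        // Loop instead of indexing with [0], so an empty collection cannot throw.
        foreach (HtmlElement area in doc.GetElementsByTagName("textarea"))
        {
            if (area.GetAttribute("name") == "message")
            {
                area.InnerText = "Hello";
                return;
            }
        }
    }
}
```

If the loop never finds the element even after loading, the textarea may live inside a frame, in which case the frame's own document would have to be searched instead.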
For more complex tasks you can use the HtmlDocument class together with MSHTML.
I can recommend HtmlAgilityPack!
I assume you are trying to access a website that uses cookies to determine whether a user is logged in. If you are not logged in, it forces you to register or log in; otherwise you aren't allowed to see anything. Am I right?
Your browser stores those cookies; your C# application does not (broadly speaking).
You need to create a cookie container to solve that problem.
Your C# app can log in, request a cookie/session, and grab the cookies from the response headers. After that you should be able to scrape the profiles or whatever else you want.
First, capture the POST data that is sent to the server. You can use tools or add-ons such as Fiddler, Tamper, etc.
E.g. PostdataString: user_name=TESTUSER&password=TESTPASSWORD&language=en&action%3Asubmit=Submit
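A string like the one above can also be assembled from key/value pairs so that every key and value is percent-encoded. The following is only a sketch; the PostDataBuilder class and its Build method are hypothetical helpers, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for building an application/x-www-form-urlencoded body.
public static class PostDataBuilder
{
    // Percent-encodes each key and value, then joins the pairs with '&'.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}
```

Called with the pairs user_name/TESTUSER, password/TESTPASSWORD, language/en and action:submit/Submit, this produces the same string as the example, because the colon in "action:submit" is encoded as %3A. Note that Uri.EscapeDataString encodes a space as %20 rather than +.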
Here is a snippet you can use.
// Requires: using System; using System.IO; using System.Net; using System.Text;
// Create the POST data (URL-encode the user input so special characters survive)
string strPostData = "user_name=" + Uri.EscapeDataString(txtUser.Text)
    + "&password=" + Uri.EscapeDataString(txtPass.Text)
    + "&language=en&action%3Asubmit=Submit";
byte[] data = Encoding.ASCII.GetBytes(strPostData);
// Container that collects the cookies set by the server
CookieContainer tempCookies = new CookieContainer();
// Create the login request
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.website.com/login.php");
request.Method = "POST";
request.KeepAlive = true;
request.AllowAutoRedirect = false;
// Attach the container; without this line the cookies are never stored
request.CookieContainer = tempCookies;
// Header values only; the property adds the header name itself
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
request.ContentType = "application/x-www-form-urlencoded";
request.Referer = "http://www.website.com/login.php";
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20100101 Firefox/14.0.1";
request.ContentLength = data.Length;
// Send the POST body and close the stream before asking for the response
using (Stream requestStream = request.GetRequestStream())
{
    requestStream.Write(data, 0, data.Length);
}
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (StreamReader postReqReader = new StreamReader(response.GetResponseStream()))
{
    // Response headers as text, including any Set-Cookie entries
    string sRequestHeaderBuffer = Convert.ToString(response.Headers);
    // RichTextBox to see the source of the returned page
    richTextBox1.Text = postReqReader.ReadToEnd();
}
You will need to adjust the cookie parameters and add your current session ID to the code as well. The exact cookie names depend on the website you are requesting.
E.g.:
request.Headers.Add("Cookie", "language=en_US.UTF-8; StationID=" + sStationID + "; SessionID=" + sSessionID);
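As an alternative to writing the Cookie header by hand, the same values can be placed in a CookieContainer, which then produces the header for a given URL. This is a sketch under assumptions: the CookieSketch class is hypothetical, and the cookie names StationID and SessionID are simply taken from the example line above.

```csharp
using System;
using System.Net;

// Hypothetical helper; the cookie names mirror the example header above.
public static class CookieSketch
{
    // Stores the cookies for the given site in a fresh container.
    public static CookieContainer CreateContainer(Uri site, string stationId, string sessionId)
    {
        var cookies = new CookieContainer();
        cookies.Add(site, new Cookie("StationID", stationId));
        cookies.Add(site, new Cookie("SessionID", sessionId));
        return cookies;
    }

    // Returns the Cookie header value the container would send to that site.
    public static string BuildCookieHeader(Uri site, string stationId, string sessionId)
    {
        return CreateContainer(site, stationId, sessionId).GetCookieHeader(site);
    }
}
```

Assigning the container to request.CookieContainer on every later request lets the framework send these cookies and also pick up any new ones the server sets, so the session carries over from the login request to the scraping requests.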