
Reading data from a website using C#

Tags: c#, webpage

I have a webpage which has nothing on it except some string(s). No images, no background color or anything, just some plain text which is not very long.

I am just wondering, what is the best (by that, I mean fastest and most efficient) way to retrieve the string from the webpage so that I can use it for something else (e.g. display it in a text box)? I know of WebClient, but I'm not sure if it'll do what I want it to do, and I'm reluctant to try it because the last time I used it, a simple operation took approximately 30 seconds.

Any ideas would be appreciated.

Asked Jan 21 '11 by Iceyoshi



1 Answer

The WebClient class should be more than capable of handling the functionality you describe, for example:

System.Net.WebClient wc = new System.Net.WebClient();
byte[] raw = wc.DownloadData("http://www.yoursite.com/resource/file.htm");
string webData = System.Text.Encoding.UTF8.GetString(raw);

or (further to Fredrick's suggestion in the comments):

System.Net.WebClient wc = new System.Net.WebClient();
string webData = wc.DownloadString("http://www.yoursite.com/resource/file.htm");
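A side note that isn't in the snippets above: as far as I recall, DownloadString decodes the response using the charset declared in the response headers, and only falls back to the WebClient.Encoding property (Encoding.Default unless you change it) when no charset is present. If your plain-text page doesn't declare one and the text comes back garbled, setting the property explicitly is worth trying:

System.Net.WebClient wc = new System.Net.WebClient();
// Fallback encoding used when the response doesn't declare a charset.
wc.Encoding = System.Text.Encoding.UTF8;
string webData = wc.DownloadString("http://www.yoursite.com/resource/file.htm");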

When you say it took 30 seconds, can you expand on that a little more? There are many reasons why that could have happened: a slow server, a slow internet connection, a dodgy implementation, and so on.

You could go a level lower and implement something like this:

// Requires: using System.IO; using System.Net; using System.Text;
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.yoursite.com/resource/file.htm");

string responseData;
using (HttpWebResponse httpResponse = (HttpWebResponse)webRequest.GetResponse())
using (StreamReader responseReader = new StreamReader(httpResponse.GetResponseStream(), Encoding.UTF8))
{
    responseData = responseReader.ReadToEnd();
}

However, at the end of the day, the WebClient class wraps up this functionality for you, so I would suggest that you use WebClient and investigate the cause of the 30-second delay.
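If you want to narrow down where those 30 seconds are going, a rough sketch along these lines may help (not part of the answer above; it simply times the download with Stopwatch and bypasses automatic proxy detection via wc.Proxy = null, which is one commonly reported cause of a long delay on the first request):

using System;
using System.Diagnostics;
using System.Net;

class Program
{
    static void Main()
    {
        WebClient wc = new WebClient();
        // Automatic proxy detection can stall the first request on some machines;
        // bypassing it is one of the first things worth trying.
        wc.Proxy = null;

        Stopwatch sw = Stopwatch.StartNew();
        string webData = wc.DownloadString("http://www.yoursite.com/resource/file.htm");
        sw.Stop();

        Console.WriteLine("Downloaded {0} characters in {1} ms", webData.Length, sw.ElapsedMilliseconds);
    }
}

If the time drops dramatically with Proxy = null, proxy auto-detection was the culprit; if not, the server or the connection is the more likely suspect.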

Answered Oct 09 '22 by MrEyes