Download an Entire Website in C#

Forgive my ignorance on the subject. I am using

    string p = "http://" + textBox2.Text;
    string r = textBox3.Text;
    System.Net.WebClient webclient = new System.Net.WebClient();
    webclient.DownloadFile(p, r);

to download a web page. Can you please help me enhance the code so that it downloads the entire website? I tried HTML screen scraping, but it only returns the href links from the index.html file. How do I proceed?

Thanks

Scraping a website is actually a lot of work, with a lot of corner cases.

Invoke wget instead. The manual explains how to use the "recursive retrieval" options.
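One way to follow this advice from C# is to start wget as a child process. The sketch below is only an illustration and assumes wget is installed and on the PATH; the class name `WgetMirror`, its method names, and the default depth of 3 are made up for this example. The flags used (`--recursive`, `--level`, `--page-requisites`, `--convert-links`, `--no-parent`, `--directory-prefix`) are standard wget options, but check the manual for the behaviour you actually want.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not from the original answer): mirrors a site by shelling out to wget.
public static class WgetMirror
{
    // Builds the wget command line for a recursive download.
    public static string BuildArguments(string url, string outputDir, int depth)
    {
        return "--recursive --level=" + depth +
               " --page-requisites --convert-links --no-parent" +
               " --directory-prefix=\"" + outputDir + "\" \"" + url + "\"";
    }

    // Runs wget, waits for it to finish, and returns its exit code (0 means success).
    public static int Run(string url, string outputDir, int depth = 3)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "wget",              // assumption: wget is on the PATH
            Arguments = BuildArguments(url, outputDir, depth),
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

With this in place, something like `WgetMirror.Run("http://" + textBox2.Text, textBox3.Text)` would save the site under the chosen folder. Note that the second argument is a directory here, not a single file name as with `DownloadFile`.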

    // Requires: using System.IO; using System.Net;
    protected string GetWebString(string url)
    {
        // Request the page and read the whole response body into a string.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

This reads the web page at the supplied URL into a string.

You can then use a regular expression to parse the string.

This little piece pulls specific links out of Craigslist and adds them to an ArrayList. Modify it to suit your purpose.

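    // Requires: using System.Collections; using System.Text.RegularExpressions;
    protected ArrayList GetListings(int pages)
    {
        // Note: the pages parameter is not used in this snippet.
        ArrayList list = new ArrayList();
        string page = GetWebString("http://albany.craigslist.org/bik/");

        // Capture the relative link (LINK) and its text (TITLE) from each listing anchor.
        MatchCollection listingMatches = Regex.Matches(page, "(<p><a href=\")(?<LINK>/.+/.+[.]html)(\">)(?<TITLE>.*)(-</a>)");
        foreach (Match m in listingMatches)
        {
            // The captured links are relative, so prepend the site root.
            list.Add("http://albany.craigslist.org" + m.Groups["LINK"].Value);
        }
        return list;
    }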
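To go from a single page to a whole site, the same two steps (fetch a page, pull out its links) can be repeated for every link found. The following is a rough sketch rather than a complete crawler: the class `SiteCrawler`, the href pattern, and the `maxPages` limit are assumptions made for illustration, and the fetch step is passed in as a delegate so that a method such as `GetWebString` can be plugged in. It only follows http/https links on the same host, and it ignores robots.txt, non-HTML content, and malformed markup, which is part of why a tool like wget is usually the easier route.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sketch: breadth-first crawl limited to a single host.
public static class SiteCrawler
{
    // Simplistic pattern for href="..." or href='...'; real HTML has many more cases.
    private static readonly Regex HrefPattern =
        new Regex("href\\s*=\\s*[\"'](?<LINK>[^\"'#]+)", RegexOptions.IgnoreCase);

    // Returns the absolute, same-host http/https links found in a page.
    public static List<Uri> ExtractLinks(string html, Uri pageUri)
    {
        var links = new List<Uri>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            Uri absolute;
            // Resolve relative links against the page they came from.
            if (Uri.TryCreate(pageUri, m.Groups["LINK"].Value, out absolute)
                && absolute.Host == pageUri.Host
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
            {
                links.Add(absolute);
            }
        }
        return links;
    }

    // Visits pages starting at 'start' and returns URL -> HTML for each page fetched.
    public static Dictionary<string, string> Crawl(Uri start, Func<string, string> fetch, int maxPages)
    {
        var pages = new Dictionary<string, string>();
        var queue = new Queue<Uri>();
        queue.Enqueue(start);

        while (queue.Count > 0 && pages.Count < maxPages)
        {
            Uri current = queue.Dequeue();
            if (pages.ContainsKey(current.AbsoluteUri)) continue; // already visited

            string html = fetch(current.AbsoluteUri);
            pages[current.AbsoluteUri] = html;

            // Queue every same-host link that has not been fetched yet.
            foreach (Uri link in ExtractLinks(html, current))
            {
                if (!pages.ContainsKey(link.AbsoluteUri)) queue.Enqueue(link);
            }
        }
        return pages;
    }
}
```

Each entry in the returned dictionary could then be written to disk, for example with `File.WriteAllText`, using a file name derived from the URL.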

Tags: c#, screen, download, web, screen-scraping

Karthik

2 Answers

Will

Jason
