
Alternative to WebClient

Tags:

c#

webclient

I've just seen a web crawler in action on my computer, and it downloaded metatag info for thousands of pages in only a few minutes.

When I use WebClient to download pages and then parse them locally, why does it take WebClient about 40 seconds just to download a single web page? Is there an alternative way to download web pages?

Thanks :)

jay_t55 asked Apr 28 '26 07:04


1 Answer

A few things to consider:

  • How many pages are you downloading at once? Web crawlers tend to work in a highly parallel way.
  • By default the .NET Framework restricts the number of parallel connections to a single host (two, for client applications). That's generally a nice thing to do - you may want to raise the limit a bit, but ideally target different sites in parallel. The <connectionManagement> configuration element is the one you need to look at.
  • Have you used Wireshark to see what's going on at the network level? If the web site is taking 40 seconds to serve the page, it's hard to see how switching away from WebClient would help.
  • Could you post some code to show exactly what you're doing?
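
For the connection limit mentioned in the list above, here is a minimal sketch of what the <connectionManagement> setting might look like. It assumes a .NET Framework application with an App.config file; the value 8 is only an illustration, not a recommendation, and address="*" applies the limit to every host.

```xml
<configuration>
  <system.net>
    <connectionManagement>
      <!-- Assumed value: allow up to 8 concurrent connections per host. -->
      <add address="*" maxconnection="8" />
    </connectionManagement>
  </system.net>
</configuration>
```

The same limit can also be set in code through ServicePointManager.DefaultConnectionLimit, before the first request is made.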

It's possible that using a different API (possibly even just WebRequest) will speed things up, but you really need to find the current bottleneck first.
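
One rough way to start looking for the bottleneck is to time the request yourself. The sketch below assumes the .NET Framework and uses a placeholder URL (http://example.com/ is hypothetical here; substitute the slow page). It uses HttpWebRequest and records how long it takes for the response headers to arrive and how long it takes to read the whole body.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Net;

class DownloadTimingSketch
{
    static void Main()
    {
        // Placeholder URL (an assumption); replace with the page that is slow for you.
        string url = "http://example.com/";

        var stopwatch = Stopwatch.StartNew();
        var request = (HttpWebRequest)WebRequest.Create(url);

        // GetResponse returns once the response headers have been received.
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            long headersMs = stopwatch.ElapsedMilliseconds;

            // Reading the stream to the end downloads the full body.
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                string html = reader.ReadToEnd();
                long totalMs = stopwatch.ElapsedMilliseconds;

                Console.WriteLine("Status: {0}", response.StatusCode);
                Console.WriteLine("Headers after {0} ms, body ({1} chars) after {2} ms",
                    headersMs, html.Length, totalMs);
            }
        }
    }
}
```

If nearly all of the time passes before the headers arrive, the delay is in connecting or in the server's response rather than in the download itself, and a network trace is the next place to look.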

Jon Skeet answered Apr 30 '26 20:04


