
Masking your web scraping activities to look like normal browser surfing activities?

I'm using the Html Agility Pack, and on certain pages I keep getting this error: "The remote server returned an error: (500) Internal Server Error."

Now I'm not sure what this is, as I can use Firefox to get to these pages without any problems.

I have a feeling the website itself is blocking my requests and not sending a proper response. Is there a way to make my Html Agility Pack request look more like one coming from Firefox?

I've already set a timer so that it only sends a request to the website every 20 seconds.

Is there any other method I can use?

Diskdrive asked Jun 05 '11 03:06



1 Answer

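Set a User-Agent similar to that of a regular browser. The User-Agent is an HTTP header that the HTTP client (for example, a browser) sends to identify itself to the server. Some sites reject or mishandle requests whose User-Agent is missing or does not look like a browser.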
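To make the idea concrete, here is a minimal sketch of sending a browser-like User-Agent with a request. It uses Python's standard library purely as an illustration, not the asker's Html Agility Pack code; in .NET the same header would need to be set on the underlying HTTP request. The User-Agent string and the helper names `build_request` and `fetch_html` are assumptions made up for this example; copy the real string from your own browser.

```python
import urllib.request

# Example browser-like User-Agent string (an assumption for illustration;
# use the exact string your own Firefox sends).
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) "
    "Gecko/20100101 Firefox/115.0"
)


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Return a Request object that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=30):
    """Download the page body as text, sending the browser-like User-Agent."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_html` with the URL of a failing page would then send the request with the browser-like header. If the error persists, the User-Agent was probably not the only difference between the scraper's request and the browser's.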

gouki answered Oct 23 '22 13:10
