
How can I prevent my asp.net site from being screen scraped? [closed]

How can I prevent my asp.net 3.5 website from being screen scraped by my competitor? Ideally, I want to ensure that no webbots or screenscrapers can extract data from my website.

Is there a way to detect that a webbot or screen scraper is running?

user279521 asked Apr 24 '10 17:04



3 Answers

It is possible to try to detect screen scrapers:

Use cookies and timing checks; this makes things harder for out-of-the-box screen scrapers. Also check for JavaScript support, which most scrapers lack. Check the browser metadata sent with each request (such as the User-Agent header) to verify that the client really is a web browser.
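A minimal sketch of the cookie and JavaScript check described above, written in Python as a language-neutral illustration rather than ASP.NET code. It assumes the page's JavaScript writes a server-issued token into a cookie; the names `SECRET_KEY`, `js_token`, `make_challenge_token` and `looks_like_browser` are hypothetical, and the header lookup assumes a plain dict keyed by `User-Agent`. A scraper that runs JavaScript or copies the cookie would still pass, so treat this as a speed bump only.

```python
import hashlib
import hmac

# Hypothetical secret; a real deployment would load this from configuration.
SECRET_KEY = b"change-me"


def make_challenge_token(session_id: str) -> str:
    """Token the page's JavaScript is expected to store in a cookie."""
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def looks_like_browser(headers: dict, cookies: dict, session_id: str) -> bool:
    """Return False when a request fails either of two simple browser checks."""
    # Check 1: the client sent a non-empty User-Agent header.
    user_agent = headers.get("User-Agent", "")
    if not user_agent.strip():
        return False

    # Check 2: the cookie that only JavaScript sets carries the expected token.
    token = cookies.get("js_token", "")
    expected = make_challenge_token(session_id)
    return hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))
```

A request that fails the check could be logged or challenged rather than rejected outright, since some legitimate users disable cookies or JavaScript.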

You can also count requests per minute. A user driving a browser can make only a small number of requests per minute, so server logic that detects too many requests per minute can presume that screen scraping is taking place and block the offending IP address for some period of time. If this starts to affect legitimate crawlers, log the IP addresses that get blocked and allow them as needed.
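The per-minute check could be sketched roughly as follows, again in Python as a language-neutral illustration rather than ASP.NET code. The class name `RequestRateLimiter`, its default thresholds, and the in-memory storage are assumptions made for the example; a real site would pick its own limits and probably keep the counters in shared storage.

```python
import time
from collections import defaultdict, deque


class RequestRateLimiter:
    """Blocks an IP for a while after too many requests in a sliding window."""

    def __init__(self, max_requests=60, window_seconds=60.0,
                 block_seconds=600.0, clock=time.monotonic):
        self.max_requests = max_requests      # assumed threshold, tune per site
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self.allowlist = set()                # IPs that are never blocked
        self.blocked_log = []                 # (ip, time) pairs for later review
        self._clock = clock
        self._hits = defaultdict(deque)       # ip -> recent request times
        self._blocked_until = {}              # ip -> time the block expires

    def allow(self, ip: str) -> bool:
        """Record one request from ip and report whether to serve it."""
        if ip in self.allowlist:
            return True
        now = self._clock()

        # Refuse requests while an earlier block is still active.
        until = self._blocked_until.get(ip)
        if until is not None:
            if now < until:
                return False
            del self._blocked_until[ip]

        # Forget requests that have fallen out of the window.
        hits = self._hits[ip]
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)

        # Too many requests in the window: block the IP and log it.
        if len(hits) > self.max_requests:
            self._blocked_until[ip] = now + self.block_seconds
            self.blocked_log.append((ip, now))
            hits.clear()
            return False
        return True
```

Reviewing `blocked_log` and adding known crawler addresses to `allowlist` corresponds to the "allow them as needed" step.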

You can also use http://www.copyscape.com/ to protect your content; this will at least tell you who is reusing your data.

See this question also:

Protection from screen scraping

Also take a look at

http://blockscraping.com/

Nice doc about screen scraping:

http://www.realtor.org/wps/wcm/connect/5f81390048be35a9b1bbff0c8bc1f2ed/scraping_sum_jun_04.pdf?MOD=AJPERES&CACHEID=5f81390048be35a9b1bbff0c8bc1f2ed

How to prevent screen scraping:

http://mvark.blogspot.com/2007/02/how-to-prevent-screen-scraping.html

James Campbell answered Nov 15 '22 19:11


Unplug the network cable to the server.

Paraphrase: if the public can see it, it can be scraped.

Update: on second look, it appears that I am not answering the question. Sorry. Vecdid has offered a good answer.

But any half-decent coder could defeat the measures listed. In that context, my answer could be considered valid.

Sky Sanders answered Nov 15 '22 19:11


I don't think it is possible without authenticating users to your site.

Raj Kaimal answered Nov 15 '22 20:11