
What are some methods I can use to detect robots?

Tags: c#, asp.net, iis

Just because software is automated doesn't mean it will abide by your robots.txt. What are some methods available to detect when someone is crawling or DDoSing your website? Assume your site has hundreds of thousands of pages and is worth crawling or DDoSing.

Here's a dumb idea I had that probably doesn't work: give each user a cookie with a unique value, and use the cookie to detect when someone is making a second, third, etc. request. This probably doesn't work because crawlers probably don't accept cookies, so in this scheme a robot would look like a new user with each request.
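For concreteness, the cookie scheme described above might look roughly like this in classic ASP.NET (the class name `VisitorTracker` and cookie name are illustrative, not part of any real API). One thing worth noting: the limitation is arguably a signal in itself, since a client that makes many requests but never returns a cookie is behaving like a bot.

```csharp
// Rough sketch of the cookie idea, assuming classic ASP.NET (System.Web).
// Names here are illustrative only.
using System;
using System.Web;

public static class VisitorTracker
{
    private const string CookieName = "visitor_id";

    // Call this from Application_BeginRequest in Global.asax.
    public static void Track(HttpContext context)
    {
        var cookie = context.Request.Cookies[CookieName];
        if (cookie == null)
        {
            // First request we've seen from this client -- or a client
            // (possibly a bot) that never sends cookies back.
            cookie = new HttpCookie(CookieName, Guid.NewGuid().ToString())
            {
                HttpOnly = true,
                Expires = DateTime.UtcNow.AddDays(1)
            };
            context.Response.Cookies.Add(cookie);
        }
        else
        {
            // Repeat visitor: a per-cookie request counter could be
            // incremented here to spot unusually high request rates.
        }
    }
}
```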

Does anyone have better ideas?

asked Jul 22 '11 by dan

1 Answer

You could put links in your pages that are not visible or clickable to end users. Many bots simply follow every link they find. Once someone requests one of those links, you can be almost certain you have a crawler/robot.
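A minimal sketch of this honeypot idea in classic ASP.NET might look like the following (the `/trap` path and `HoneypotHandler` name are assumptions for illustration). To avoid flagging polite crawlers, the trap URL could also be listed under `Disallow` in robots.txt, so that only bots ignoring robots.txt ever hit it.

```csharp
// Sketch of a hidden-link honeypot, assuming classic ASP.NET (System.Web).
// The /trap path and class name are illustrative.
using System.Web;

public class HoneypotHandler : IHttpHandler
{
    public bool IsReusable => true;

    // Map this handler to a path like "/trap" in web.config, and link to it
    // from your pages with an anchor end users will never see or click, e.g.:
    //   <a href="/trap" style="display:none" rel="nofollow">&nbsp;</a>
    public void ProcessRequest(HttpContext context)
    {
        string ip = context.Request.UserHostAddress;

        // A real implementation would record 'ip' in a blocklist or log
        // for later rate-limiting / banning; omitted here.

        context.Response.StatusCode = 404; // give the bot nothing useful
    }
}
```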

answered Sep 28 '22 by BrokenGlass