
How do I detect bots programmatically?

Tags: asp.net, bots

We log visits and visitors on every page hit, and bots are clogging up our database. We can't use a CAPTCHA or similar techniques, because the logging happens before we ever ask for human input. We would like to log only page hits made by humans.

Is there a list of known bot IPs out there? Does checking for known bot user-agents work?

Tom DeMille asked May 05 '10 19:05





1 Answer

There is no sure-fire way to catch all bots. A bot can behave exactly like a real browser if its author wants it to.

Most serious bots identify themselves clearly in the user-agent string, so with a list of known bots you can filter out most of them. You can also add to the list the default agent strings that common HTTP libraries send, to catch bots written by people who don't even know how to change the agent string. If you simply log the agent strings of your visitors for a while, you should be able to pick out the ones to put in the list.
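As a rough illustration of the agent-string filter described above, here is a minimal sketch. It is written in Python only to keep it short; the same check translates directly to ASP.NET. The token list and the function name `is_probable_bot` are assumptions made for this example, not a complete or authoritative list, and treating an empty agent string as a bot is a policy choice you may not want.

```python
import re

# Illustrative, deliberately short token list (an assumption for this sketch).
# Build the real list from the agent strings you see in your own logs.
KNOWN_BOT_PATTERN = re.compile(
    r"googlebot|bingbot|slurp|duckduckbot|baiduspider|yandexbot"  # crawlers
    r"|python-requests|curl/|wget/|libwww-perl",  # HTTP library defaults
    re.IGNORECASE,
)


def is_probable_bot(user_agent):
    """Return True if the agent string is missing or contains a known bot token."""
    # Browsers always send a User-Agent, so an empty one is treated as a bot here.
    if not user_agent or not user_agent.strip():
        return True
    return KNOWN_BOT_PATTERN.search(user_agent) is not None
```

You would call `is_probable_bot` with the request's User-Agent header before writing the page hit, and skip the insert when it returns True. This only catches bots that identify themselves honestly.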

You can also make a "bad bot trap" by putting a hidden link on your page that leads to a page that is disallowed in your robots.txt file. Serious bots will not follow the link, and humans can't click on it, so only bots that ignore the rules will request that page.
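The trap logic can be sketched as follows, again in Python for brevity. The path `/bot-trap`, the in-memory set, and the function name `should_log_hit` are all hypothetical; a real site would probably persist the flagged addresses and expire them after a while, since IPs get reused.

```python
# Hypothetical trap URL. robots.txt would disallow it, for example:
#   User-agent: *
#   Disallow: /bot-trap
# and the page would contain a link to it that is hidden from human visitors.
TRAP_PATH = "/bot-trap"

# Client addresses that have requested the trap page (kept in memory for this sketch).
flagged_ips = set()


def should_log_hit(path, client_ip):
    """Return True if this page hit should be stored as a human visit."""
    if path == TRAP_PATH:
        # Only a bot that ignores robots.txt ends up here; remember its address.
        flagged_ips.add(client_ip)
        return False
    # Skip every later hit from an address that fell into the trap.
    return client_ip not in flagged_ips
```

Hits made by a bad bot before it reaches the trap are still logged, so you may want to clean those up afterwards by address.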

Guffa answered Sep 17 '22 19:09
