
Best way to detect bot from user agent?

Tags: php, search

Time goes by, but there is still no perfect solution. Does anyone have a bright idea for telling a bot from a human when a web page is requested? Is the state of the art still loading a long list of well-known search engine bots and parsing the user agent string?

Testing has to be done before the page is loaded! No GIFs or CAPTCHAs!

Riccardo asked Nov 05 '10 14:11




1 Answer

If possible, I would try a honeypot approach here. The honeypot is invisible to most users and will stop many bots, though not determined ones: once a bot's owner figures out your game, they can add site-specific code that skips the honeypot field. That takes more attention from the bot's owner than it is probably worth for most, since there are plenty of other sites that accept spam with no extra effort on their part.
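A minimal sketch of the honeypot check, assuming a hypothetical form field named `website` that is hidden from human visitors with CSS. The field name and the function `honeypot_tripped` are made up for illustration and are not part of any library.

```php
<?php
// Hypothetical honeypot check. The form is assumed to contain an extra input such as
//   <input type="text" name="website" style="display:none" tabindex="-1" autocomplete="off">
// Humans never see it, so it stays empty; many bots fill in every field they find.

/**
 * Returns true when the hidden honeypot field was filled in.
 *
 * @param array  $post  Submitted form data (normally $_POST).
 * @param string $field Name of the hidden field (an assumption for this sketch).
 */
function honeypot_tripped(array $post, string $field = 'website'): bool
{
    // A missing field counts as "not tripped": a browser normally submits
    // the hidden input as an empty string, but some clients omit empty inputs.
    $value = $post[$field] ?? '';

    // Anything that is not a plain string (e.g. an array from "website[]=x")
    // did not come from the real form, so treat it as a bot as well.
    if (!is_string($value)) {
        return true;
    }

    return trim($value) !== '';
}
```

This only catches bots that fill in fields blindly; as noted above, a bot written specifically for your site can simply leave the field empty.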

One thing that often gets overlooked: it is important to let the bot think everything went fine. Show no error messages or denial pages; just reload the page as you would for any other user, but skip adding the bot's content to the site. That way there are no red flags in the bot's logs for its owner to pick up and act on, and it will take much more scrutiny to figure out that you are rejecting the comments.
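The silent-discard idea could look roughly like the sketch below. Everything named here is an assumption for illustration: the `handle_comment` function, the `website` honeypot field, the `comment` and `post_id` fields, the `/post.php` URL, and the `$save` callback that stands in for whatever actually stores a comment.

```php
<?php
/**
 * Handles a comment submission and returns the URL to redirect to.
 * Suspected spam is dropped silently: the caller gets the same redirect
 * target whether or not the comment was stored.
 *
 * @param array    $post  Submitted form data (normally $_POST).
 * @param callable $save  Hypothetical callback that stores one comment string.
 * @param string   $field Name of the hidden honeypot field (assumed).
 */
function handle_comment(array $post, callable $save, string $field = 'website'): string
{
    // The honeypot is tripped when the hidden field holds anything but whitespace.
    $trap  = $post[$field] ?? '';
    $isBot = !is_string($trap) || trim($trap) !== '';

    $comment = $post['comment'] ?? '';

    // Only store the comment for submissions that left the honeypot empty.
    if (!$isBot && is_string($comment)) {
        $save($comment);
    }

    // Identical response in both cases: no error message, no denial page.
    return '/post.php?id=' . (int) ($post['post_id'] ?? 0);
}
```

A caller might then do `header('Location: ' . handle_comment($_POST, 'save_comment')); exit;`, where `save_comment` is a hypothetical function of your own. Because the redirect is the same either way, the bot's logs show nothing unusual.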

Matthew Vines answered Oct 06 '22 23:10