How to deal with botnets and automated submissions

Short story: I have a web application that has a huge incentive for participation. As such, we're being targeted heavily by scripters and bots. Based on the IP addresses the submissions are coming from (1000+ and growing, no pattern whatsoever), I'm inclined to believe the submissions are being generated by a bot network. Even worse, the person(s) controlling the automated submissions are actively pursuing this, to the point that every time we make a change, they catch up within a few hours.

Some of the measures we've tried already:

  • Captcha, both third party and home-grown, with varying degrees of readability
  • An anti-request-forgery token sent via cookie and hidden form field, compared upon submit
  • A hidden empty honeypot field that causes the submission to fail silently if the field contains data
  • A hidden honeypot field that contains data by default and results in a silent fail if a piece of JavaScript does not run to clear the field's value
  • Limiting submissions by IP address over a certain time period
  • Blocking email domains known to be used by the automated scripts
  • Blocking hosts based on simultaneous connections or connections per minute at the firewall
  • Blocking the most flagrant IP addresses at the firewall
  • Using an external address verification service to verify incoming addresses
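
For reference, two of the measures above (the pair of honeypot fields and the per-IP limit over a time period) can be sketched roughly as follows. This is only an illustration, not our actual code: the field names `website` and `js_check`, the one-hour window, and the limit of three submissions are all made-up values.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 3600   # hypothetical look-back window
MAX_PER_WINDOW = 3      # hypothetical number of submissions allowed per IP per window

_recent = defaultdict(deque)   # ip -> timestamps of accepted submissions


def honeypots_ok(form):
    """True when both honeypot fields arrive empty, as a real browser would send them."""
    # "website" (hypothetical name) starts empty and must stay empty.
    # "js_check" (hypothetical name) starts filled and is cleared by JavaScript.
    return form.get("website", "") == "" and form.get("js_check", "") == ""


def within_rate_limit(ip, now=None):
    """Sliding-window check; records the submission when it is allowed."""
    now = time.time() if now is None else now
    stamps = _recent[ip]
    # Drop timestamps that have fallen out of the window.
    while stamps and now - stamps[0] >= WINDOW_SECONDS:
        stamps.popleft()
    if len(stamps) >= MAX_PER_WINDOW:
        return False
    stamps.append(now)
    return True


def accept_submission(ip, form, now=None):
    """Return True to store the entry, False to drop it silently."""
    return honeypots_ok(form) and within_rate_limit(ip, now)
```

With 1000+ source addresses, a per-IP limit like this only caps what each bot can send; it does not reduce the number of bots.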

Even with all of these measures in place, the submissions have not only continued, but seem to be increasing in frequency, on the order of 100,000+ per day.

The bogus entries are now using completely valid first and last names, and apparently have resorted to using some sort of directory listing to ensure that the addresses they use (which appear totally random and not at all consistent, btw) are actually valid US postal addresses. Additionally, I have logged the incoming form values to a debug log and verified that they are actually submitting valid captcha codes, indicating they have OCR good enough to decipher the images (the code itself is never sent to the client, only a GUID representing a code that is stored elsewhere on the back end).

In fact, the only way we can even tell the entries are bogus is by the pattern of email addresses and domains they are submitting. We've tried blocking the most active domains from entering, but the spammers just create or find new domains from which they can generate disposable email addresses and keep on going.
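
To make the "pattern of email domains" idea concrete, here is a rough sketch of how entries could be grouped by domain so that newly active domains surface quickly. The threshold and the allowlist of large legitimate providers are assumptions for illustration, and this only flags domains for review; it does not solve the problem of new domains appearing.

```python
from collections import Counter


def email_domain(address):
    """Lower-cased part after the last '@', or '' if there is no '@'."""
    _, sep, domain = address.rpartition("@")
    return domain.strip().lower() if sep else ""


def suspicious_domains(addresses, threshold=50, allowlist=frozenset()):
    """Domains seen at least `threshold` times that are not on the allowlist.

    `threshold` and `allowlist` are hypothetical tuning knobs.
    """
    counts = Counter(email_domain(a) for a in addresses)
    counts.pop("", None)   # ignore addresses without a domain
    return {d: n for d, n in counts.items()
            if n >= threshold and d not in allowlist}
```

Run over a recent window of submissions, the result is a short list of domains to inspect by hand before deciding whether to block them.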

I'm pretty exhausted at this point, but I'm sure there's got to be something I haven't tried. Does anyone here have any bright ideas?

Chris asked Jul 31 '11 22:07



1 Answer

The problem is that merely registering on your site gives a user too many rights at once. The user is trusted "too fast".

Look at Stack Overflow: you can register, but you gain almost no rights at the beginning. A user's permission level increases over time, as trust in that user grows because of what the user does and because other users approve of it.

I would focus on making user "trust" a kind of buildable resource, where other users have to confirm the "authority level" of a particular user. Automatically registering users would then be pointless, because those accounts can do nothing.
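
A minimal sketch of what I mean, assuming trust is a simple point score raised by endorsements from users who are already trusted. The action names, point values, and thresholds are invented for illustration; a real site would pick its own.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds: trust points needed for each action.
REQUIRED_TRUST = {"browse": 0, "submit_entry": 10, "endorse": 25}


@dataclass
class User:
    name: str
    trust: int = 0
    endorsed_by: set = field(default_factory=set)

    def can(self, action):
        """True when the user has enough trust for the given action."""
        return self.trust >= REQUIRED_TRUST[action]


def endorse(endorser, target, points=5):
    """Raise target's trust if the endorser is trusted enough and has not endorsed before."""
    if endorser is target or not endorser.can("endorse"):
        return False
    if endorser.name in target.endorsed_by:
        return False   # each user may endorse a given target only once
    target.endorsed_by.add(endorser.name)
    target.trust += points
    return True
```

A freshly registered account starts at zero trust, so it can neither submit nor endorse other fresh accounts; a bot network registering thousands of users gains nothing until real, trusted users vouch for them.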

I don't know what your site is about, so my suggestion may not be applicable... But I hope it moves your thinking forward :)

Roman Pietrzak answered Sep 19 '22 17:09
