How to block bot requests to URLs that match a common pattern in Apache?

Tags: regex, apache, bots

I've got an Apache server that gets hit with a burst of about 100 requests every 30 minutes, all for URLs that match this pattern:

/neighborhood/****/feed

These URLs used to be valid and have content on them. Now they all return 404, so this bot is killing performance every time it hits us.

What do I add to my .htaccess file to block it?

Note: The bot is on EC2 so blocking by IP address won't work. I need to block requests that match that pattern.

bflora2 asked Jan 09 '11 18:01


2 Answers

Using a mod_rewrite rule should get you to where you want to be:

RewriteEngine On
RewriteCond %{REQUEST_URI} ^/neighborhood/[^/]+/feed$ [NC]
RewriteRule ^.*$ - [F,L]

The above goes into your .htaccess file. If you'd prefer to put it in your vhost file instead (because you've turned off .htaccess parsing for performance, which is a good idea), use:

<Location />
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/neighborhood/[^/]+/feed$ [NC]
RewriteRule ^.*$ - [F,L]
</Location>

Given a URI of /neighborhood/carson/feed you should expect a response such as:

Forbidden

You don't have permission to access /neighborhood/carson/feed on this server.

Apache/2.2.16 (Ubuntu) Server at ... Port 80

This was tested on my local VM running Apache/2.2.16 on Ubuntu 10.10.
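
As a possible simplification, the condition and rule above can be folded into a single RewriteRule. This is only a sketch, assuming the rule sits in the .htaccess file at the document root, where mod_rewrite strips the leading slash before matching:

```apache
RewriteEngine On
# Per-directory (.htaccess) context: the pattern is matched without the leading slash.
# F sends 403 Forbidden and implies L, so no explicit L flag is required.
RewriteRule ^neighborhood/[^/]+/feed$ - [F,NC]
```

In a vhost the same rule would need the leading slash (^/neighborhood/[^/]+/feed$). Note also that the mod_rewrite documentation describes rewrite rules inside <Location> sections as unsupported, so placing the directives directly inside the <VirtualHost> block may be the safer choice.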

Wil Moore III answered Sep 19 '22 15:09


To return a 404 instead of a 403, pass the status code to the R flag in mod_rewrite. All flags go in a single comma-separated bracket:

RewriteRule pattern - [R=404,other_flags]
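
Applied to the URL pattern from the question, that might look like the following. It is a sketch, again assuming the rule lives in the document root's .htaccess file, so the pattern has no leading slash:

```apache
RewriteEngine On
# Answer 404 Not Found for /neighborhood/<anything>/feed.
# With a status code outside the 3xx range, the substitution is ignored and
# rewriting stops as if L had been given, so no L flag is needed.
RewriteRule ^neighborhood/[^/]+/feed$ - [R=404,NC]
```

The bot then keeps receiving the same 404 status it gets today, but the response is produced by Apache itself rather than by the application behind it.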
kn_pavan answered Sep 17 '22 15:09