I'm trying to keep search engine crawlers such as Google and Bing out of my access logs. They really build up over time, eventually adding hundreds of thousands of useless entries, which is especially a pain if you ever have to search through the logs. The trouble is that I define the access log in my server block, so Nginx uses that one and ignores the second one I define in the location / block. If I comment out the main access log for the site (but not the crawler block), it works fine. Here is the configuration:
server {
    listen      80;
    server_name example.com;

    access_log  /home/domains/example.com/logs/access;
    error_log   /home/domains/example.com/logs/error;

    root        /home/domains/example.com/forums;

    location / {
        index index.html index.htm;

        if ($http_user_agent ~* "googlebot") {
            access_log off;
        }
    }
}
I've stripped everything else out before posting (PHP includes and so on), and I've checked that nothing is interfering by commenting out everything except what is above. To sum up: I have an access log defined in my virtual host block to record all of the traffic (I define one for every host to keep things tidy), and I'm trying to disable logging for certain user agents. But unless I disable the main log for the site entirely, Nginx keeps logging the user agents I've told it not to.
I've been at this for a few hours now; any help will be greatly appreciated.
You should not use if statements in nginx; its behaviour is surprising enough that the nginx documentation has a page titled "If Is Evil". Use conditional logging instead:
http {
    # Map the User-Agent header to a flag: "0" means "do not log",
    # anything else means "log". Note that ~Googlebot is a
    # case-sensitive regex match.
    map $http_user_agent $excluded_ua {
        ~Googlebot  0;
        default     1;
    }
    .......
}
server {
    # With if=, a request is logged only when the variable's value
    # is non-empty and not "0".
    access_log /home/domains/example.com/logs/access combined if=$excluded_ua;
}
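The same map extends naturally to several crawlers at once. A minimal sketch, assuming you also want to exclude Bingbot and Yandex (the exact bot names in the alternation are up to you):

map $http_user_agent $excluded_ua {
    # Case-insensitive match against several known crawler tokens;
    # extend the alternation as needed.
    ~*(googlebot|bingbot|yandexbot)  0;
    default                          1;
}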
However, be careful about excluding Googlebot this way, as some abusive bots disguise themselves with its user agent; see the sketch below.
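If that's a concern, you can also require the request to come from Google's crawler address space before suppressing the log entry. A minimal sketch, assuming 66.249.64.0/19 is still one of Google's published crawler ranges (verify against Google's current list) and combining geo with the map above:

# Flag requests originating from a known Googlebot network.
geo $googlebot_ip {
    default         0;
    66.249.64.0/19  1;   # assumed Google crawler range; verify first
}

# Only exclude from logging when both the IP flag and the UA match.
map "$googlebot_ip:$http_user_agent" $excluded_ua {
    ~*^1:.*googlebot  0;
    default           1;
}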