Correct nginx configuration to prevent indexing of some folders

I'm using the following Nginx configuration to keep the content of some of my folders out of search indexes by sending an X-Robots-Tag header:

location ~ .*/(?:archive|filter|topic)/.* {
    add_header X-Robots-Tag "noindex, follow";      
}

The content is still indexed, and I don't know how to debug the Nginx configuration.

My question: is the configuration I use correct, so that I only need to wait until Googlebot re-crawls and de-indexes the content? Or is my configuration wrong?

Evgeniy asked Mar 29 '17 13:03


1 Answer

The configuration you've written is correct. I'd give one caveat (assuming your config is otherwise standard):

It will only output the X-Robots-Tag when the response code is 200, 201, 204, 206, 301, 302, 303, 304, or 307 (308 is also on the list as of nginx 1.13.0), e.g. when the request matches a file on disk or a redirect is issued. So if you have an /archive/index.html, a hit to http://yoursite.com/archive/ will give the header. If the index.html does not exist (404), you won't see the tag.
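One way to see what a given URL actually returns is to request it and look at the status code next to the X-Robots-Tag header. The following is a minimal Python sketch using only the standard library. The function name robots_tag is made up for illustration, and the URL in the usage note is a placeholder. Note that urllib follows redirects, so a 3xx response is not reported directly; you get the final response instead.

```python
import urllib.error
import urllib.request


def robots_tag(url):
    """Return (status_code, X-Robots-Tag value or None) for a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.headers.get("X-Robots-Tag")
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise HTTPError, but their headers are still available.
        return error.code, error.headers.get("X-Robots-Tag")
```

Calling print(robots_tag("http://yoursite.com/archive/")) should then show the pair of status code and header value. With the configuration from the question, a 404 under /archive/ would be expected to come back with None for the header.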

The always parameter will output the header for all response codes, assuming the location block is processed:

location ~ .*/(?:archive|filter|topic)/.* {
    add_header X-Robots-Tag "noindex, follow" always;      
}
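To get a feel for which request paths this location regex is meant to catch, the pattern can be tried against a few sample paths. This is a rough sketch in Python, assuming Python's re module behaves like nginx's PCRE for this simple pattern. The sample paths and the helper name location_matches are invented for illustration.

```python
import re

# Same pattern as in the location block above.
LOCATION_PATTERN = re.compile(r".*/(?:archive|filter|topic)/.*")


def location_matches(path):
    """Return True if the pattern matches anywhere in the path (unanchored search)."""
    return LOCATION_PATTERN.search(path) is not None


for sample in ["/archive/", "/blog/topic/42", "/archive", "/archives/2017/"]:
    print(sample, location_matches(sample))
```

A path such as /archive (no trailing slash) or /archives/2017/ does not match, because the pattern requires a slash directly after the folder name. Since the search is unanchored, the leading and trailing .* should not change which paths match.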

Another option will guarantee the header is output on a URI match. This is useful when there's a chance that a location block may not get processed (due to short-circuiting, such as a return, or a last flag on a rewrite):

http {
    ...
    map $request_uri $robot_header {
        default "";
        ~.*/(?:archive|filter|topic)/.* "noindex, follow";
    }

    server {
        ...
        add_header X-Robots-Tag $robot_header;
        ...
    }
}

Allen Luce answered Sep 21 '22 04:09