Goal: I'm trying to set two headers via htaccess:
X-Robots-Tag: noindex, nofollow
Location: http://example.com/foo
PoC: In PHP one could do this which works well:
header( "X-Robots-Tag: noindex, nofollow", true );
header( "Location: " . $url, true, 302 );
Problem: In my .htaccess file I have this:
# Do not let robots index anything from /out/
RewriteCond %{REQUEST_URI} ^/?out/?
Header set X-Robots-Tag "noindex, nofollow"
...
# Redirect /out/example/ type links
RewriteRule ^/?out/example/(.*)$ "http://example.com/$1" [R=302,L]
I'm sure there is a simple mistake somewhere that I'm not seeing, but if I inspect the headers of, say, http://localhost/out/example/foo, the Location header is set, but the X-Robots-Tag is not.
HTTP/1.1 302 Found
Date: Wed, 08 Jun 2016 23:59:18 GMT
Content-Type: text/html; charset=iso-8859-1
Transfer-Encoding: chunked
Connection: close
Location: http://example.com/foo
...
However, triggering a 404 (e.g. http://localhost/out/404) will set the appropriate header:
HTTP/1.1 404 Not Found
Date: Wed, 08 Jun 2016 23:56:19 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: close
Vary: Accept-Encoding,User-Agent
X-Robots-Tag: noindex, nofollow <--- set
...
Where is the problem?
A rewrite rule can be invoked in httpd.conf or in .htaccess. The path generated by a rewrite rule can include a query string, or can lead to internal sub-processing, external request redirection, or internal proxy throughput.
L|last. The [L] flag causes mod_rewrite to stop processing the rule set. In most contexts, this means that if the rule matches, no further rules will be processed.
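As a rough illustration of the [L] behaviour described above, here is a minimal sketch. The first rule is the one from the question; the second rule is purely hypothetical and only there to show what gets skipped.

```apache
RewriteEngine On

# Rule from the question: when it matches, [L] stops rule processing
# for this pass, so the rule below is not evaluated for these requests.
RewriteRule ^/?out/example/(.*)$ "http://example.com/$1" [R=302,L]

# Hypothetical catch-all for other /out/ links (not in the original config).
# It is only reached when the rule above did not match.
RewriteRule ^/?out/ - [F]
```

Note that a RewriteCond only attaches to the next RewriteRule. It has no effect on a Header directive, which belongs to mod_headers, so the RewriteCond in the question never limited the Header line to /out/.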
The solution was to do the following:
# Redirect /out/example/ type links
RewriteRule ^/?out/example/(.*)$ "http://example.com/$1" [R=302,L,E=OUTLINK:1]
# Add the robots header if E was set above
Header always set X-Robots-Tag "noindex, nofollow" env=OUTLINK
Note: This was a challenge because the initial solution was adding the "noindex" header to everything which killed my site. I hope this helps someone in the future.
By default, the Header directive only applies to successful (2xx) responses. To have the header set for any other status code, such as the 302 redirect here, you need to use the always keyword:
Header always set X-Robots-Tag "noindex, nofollow"
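On its own, the line above applies to every response the config covers, not just /out/. One way to scope it, sketched here on the assumption that mod_setenvif is enabled, is to flag matching requests with an environment variable first. The variable name OUTLINK is just an illustrative choice.

```apache
# Flag any request whose path starts with /out/
# (OUTLINK is an arbitrary variable name picked for this sketch).
SetEnvIf Request_URI "^/out/" OUTLINK

# Send the header only for flagged requests; "always" makes it apply
# to redirects and error responses as well as 2xx responses.
Header always set X-Robots-Tag "noindex, nofollow" env=OUTLINK
```

I have not tested this against the exact setup in the question, so check the response headers for both an /out/ URL and a normal page before relying on it.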
More Information:
When your action is a function of an existing header, you may need to specify a condition of always, depending on which internal table the original header was set in. The table that corresponds to always is used for locally generated error responses as well as successful responses. Note also that repeating this directive with both conditions makes sense in some scenarios because always is not a superset of onsuccess with respect to existing headers:
- You're adding a header to a locally generated non-success (non-2xx) response, such as a redirect, in which case only the table corresponding to always is used in the ultimate response.
- You're modifying or removing a header generated by a CGI script, in which case the CGI scripts are in the table corresponding to always and not in the default table.
- You're modifying or removing a header generated by some piece of the server but that header is not being found by the default onsuccess condition.
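To make the "repeating this directive with both conditions" remark concrete, here is a small sketch. It assumes the OUTLINK environment variable from the solution earlier in this thread has already been set for the requests you care about.

```apache
# Remove any copy already present in the default (onsuccess) table...
Header onsuccess unset X-Robots-Tag env=OUTLINK

# ...then set exactly one copy in the always table, which is the table
# used for redirects and locally generated error responses.
Header always set X-Robots-Tag "noindex, nofollow" env=OUTLINK
```

The point of the pair is to avoid the header showing up twice on 2xx responses when something else has already added it to the default table.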