
Disallow or Noindex on Subdomain with robots.txt

Tags:

robots.txt

I have two subdomains, dev.example.com and www.example.com. I want crawlers to drop all records of the dev subdomain but keep indexing www. I am using git to store the code for both sites, so ideally they would share the same robots.txt file.

Is it possible to use one robots.txt file and have it exclude crawlers from the dev subdomain?

asked Feb 05 '11 by Kirk Ouimet

People also ask

How do I block subdomains in robots.txt?

robots.txt blocks crawling rather than indexing. So I would recommend noindex markup on your pages (assuming they return a 200 status), then use the URL removal tool in Google Search Console to remove the entire subdomain from being visible in search (a minimal example follows this section).

Do subdomains need their own robots.txt?

Each subdomain is generally treated as a separate site and requires its own robots.txt file.

How do I block a subdomain?

Subdomains can be blocked using URL Filtering (which requires a license). To block multiple subdomains, add the sites to the Block List of a URL Filtering profile.
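
One way to apply that noindex advice without editing every page is an X-Robots-Tag response header. This is only a minimal sketch, assuming Apache with mod_headers enabled and that it is placed solely in the dev.example.com virtual host (the host name comes from the question, not from any original answer):

<IfModule mod_headers.c>
    # Ask crawlers not to index or follow anything served by this (dev) host
    Header set X-Robots-Tag "noindex, nofollow"
</IfModule>

Because the header is set only on the dev virtual host, www.example.com is left untouched.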


1 Answer

You could use Apache rewrite rules (mod_rewrite) to serve a different robots.txt on the development subdomain:

<IfModule mod_rewrite.c>
    RewriteEngine on
    # Only rewrite requests that arrive on the dev host
    RewriteCond %{HTTP_HOST} ^dev\.example\.com$
    # Serve robots-dev.txt in place of the shared robots.txt
    RewriteRule ^robots\.txt$ robots-dev.txt [L]
</IfModule>

And then create a separate robots-dev.txt:

User-agent: *
Disallow: /
answered Sep 19 '22 by Christian Davén
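
For contrast, the shared robots.txt that www.example.com keeps serving could simply allow everything; a minimal sketch (an assumption, since the asker's real file may contain other rules):

User-agent: *
Disallow:

Since both files are committed to the same git repository, each host picks up the right one through the rewrite rule above, with no branching of the code.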