I have two subdomains, dev.example.com and www.example.com. I want crawlers to drop all records of the dev subdomain but keep them for www. I am using git to store the code for both, so ideally both sites would use the same robots.txt file.
Is it possible to use one robots.txt file and have it exclude crawlers from the dev subdomain?
robots.txt blocks crawling rather than indexing, so on its own it won't get pages dropped from search results. I would recommend adding noindex markup to the pages on the dev subdomain (assuming they return a 200 status), then using the URL removal tool in Google Search Console to remove the entire subdomain from search results.
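If editing every page template to add a meta robots tag is impractical, one option is to send the equivalent signal as an HTTP response header. Here is a minimal sketch for Apache, assuming mod_setenvif and mod_headers are enabled; the DEV_SITE variable name is just illustrative:
<IfModule mod_setenvif.c>
<IfModule mod_headers.c>
    # Flag requests whose Host header is the dev subdomain
    SetEnvIfNoCase Host ^dev\.example\.com$ DEV_SITE
    # Ask crawlers not to index anything served from that host
    Header set X-Robots-Tag "noindex, nofollow" env=DEV_SITE
</IfModule>
</IfModule>
This has the same effect as a <meta name="robots" content="noindex"> tag on each page, but it applies to every response from the dev host, including non-HTML files.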
Each subdomain is generally treated as a separate site and requires its own robots.txt file.
You could use Apache rewrite logic to serve a different robots.txt on the development domain:
<IfModule mod_rewrite.c>
    RewriteEngine on
    # When the request comes in on the dev subdomain...
    RewriteCond %{HTTP_HOST} ^dev\.example\.com$
    # ...serve robots-dev.txt instead of the regular robots.txt
    RewriteRule ^robots\.txt$ robots-dev.txt [L]
</IfModule>
And then create a separate robots-dev.txt that blocks all crawlers:
User-agent: *
Disallow: /
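For completeness, the regular robots.txt checked into the same repository can stay permissive so that www keeps being crawled normally. A minimal permissive file (an empty Disallow value allows everything) might look like this:
User-agent: *
Disallow: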