Let's say I have a web site for hosting community-generated content that targets a very specific set of users. Now, let's say that, in the interest of fostering a better community, I have an off-topic area where community members can post or talk about anything they want, regardless of the site's main theme.
Now, I want most of the content to get indexed by Google. The notable exception is the off-topic content. Each thread has its own page, but all the threads are listed in the same folder, so I can't just exclude search engines from a folder somewhere. It has to be per-page. A traditional robots.txt file would get huge, so how else could I accomplish this?
This will work for all well-behaved search engines. Just add it to the <head> of each page you want excluded:
<meta name="robots" content="noindex, nofollow" />
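Since the tag has to be emitted per page, the page template needs to know whether a thread is off-topic. Here is a minimal sketch in Python, assuming your application can tell you that via a flag; `render_head` and `is_off_topic` are hypothetical names, not part of any framework:

```python
from html import escape

# The tag from the answer above, emitted only on pages that should stay out of the index.
ROBOTS_META = '<meta name="robots" content="noindex, nofollow" />'


def render_head(title: str, is_off_topic: bool) -> str:
    """Build the <head> for a thread page (hypothetical helper).

    The robots meta tag is included only when the thread is off-topic,
    so every other thread stays indexable.
    """
    parts = ["<head>", f"<title>{escape(title)}</title>"]
    if is_off_topic:
        parts.append(ROBOTS_META)
    parts.append("</head>")
    return "\n".join(parts)
```

In a real site the same condition would probably live in whatever template engine you already use; the point is only that the decision is made per thread, not per folder.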
If you're using Apache, I'd use mod_rewrite to alias robots.txt to a script that dynamically generates the necessary content.
Edit: If you're using IIS, you could use ISAPI_Rewrite to do the same.
You can implement it by replacing robots.txt with a dynamic script that generates the output. With Apache, you could add a simple .htaccess rule to achieve that:
RewriteRule ^robots\.txt$ /robots.php [NC,L]
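The script behind that rewrite can be written in any language. As a rough sketch of the generation logic in Python, assuming you can query the URL paths of the off-topic threads from your database (`build_robots_txt` and the example paths are hypothetical):

```python
from typing import Iterable


def build_robots_txt(off_topic_paths: Iterable[str]) -> str:
    """Return a robots.txt body that disallows each off-topic thread path.

    off_topic_paths is assumed to come from the application's own data,
    e.g. a query for threads flagged as off-topic.
    """
    lines = ["User-agent: *"]
    paths = sorted(set(off_topic_paths))  # de-duplicate, keep output stable
    if not paths:
        # An empty Disallow means nothing is blocked.
        lines.append("Disallow:")
    for path in paths:
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical thread URLs, for illustration only.
    print(build_robots_txt(["/threads/42", "/threads/7"]), end="")
```

Whatever serves this output should send it as plain text. Note that the generated file still grows with the number of off-topic threads; the script only saves you from maintaining it by hand.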