Dynamic robots.txt

Tags:

seo

Let's say I have a website for hosting community-generated content that targets a very specific set of users. Now, let's say that in the interest of fostering a better community I have an off-topic area where community members can post or talk about anything they want, regardless of the site's main theme.

Now, I want most of the content to get indexed by Google. The notable exception is the off-topic content. Each thread has its own page, but all the threads are listed in the same folder, so I can't just exclude search engines from a folder somewhere. It has to be per-page. A traditional robots.txt file would get huge, so how else could I accomplish this?

Joel Coehoorn asked Sep 04 '08 15:09


3 Answers

This will work for all well-behaved search engines; just add it to the <head> of each page you want excluded:

<meta name="robots" content="noindex, nofollow" />
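Since the exclusion has to be per-page, the tag would be emitted only for off-topic threads. A minimal sketch of that idea, assuming the page template is rendered by a Python function; `render_head` and the `is_off_topic` flag are hypothetical names, not part of the original answer:

```python
from html import escape


def render_head(title: str, is_off_topic: bool) -> str:
    """Build a <head> element, adding the robots meta tag only for off-topic threads."""
    parts = ["<head>", f"  <title>{escape(title)}</title>"]
    if is_off_topic:
        # Ask compliant crawlers not to index this page or follow its links.
        parts.append('  <meta name="robots" content="noindex, nofollow" />')
    parts.append("</head>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_head("Weekend plans", is_off_topic=True))
```

On-topic threads get the same `<head>` without the tag, so they stay indexable and robots.txt does not need to change.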
UnkwnTech answered Nov 08 '22 18:11


If using Apache, I'd use mod_rewrite to alias robots.txt to a script that dynamically generates the necessary content.

Edit: If using IIS, you could use ISAPI_Rewrite to do the same.

James Marshall answered Nov 08 '22 18:11


You can implement it by replacing robots.txt with a dynamic script that generates the output. With Apache, you could add a simple .htaccess rule to achieve that.

RewriteRule  ^robots\.txt$ /robots.php [NC,L]
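The rule above hands requests for robots.txt to a script. What that script might produce can be sketched as follows, in Python rather than PHP for illustration. The function name `build_robots_txt` and the `/threads/...` paths are hypothetical, and the trailing `$` assumes the crawler honors the end-of-path anchor (Google does; simpler crawlers may treat it literally):

```python
from typing import Iterable


def build_robots_txt(off_topic_paths: Iterable[str], user_agent: str = "*") -> str:
    """Return a robots.txt body that disallows each off-topic thread path."""
    lines = [f"User-agent: {user_agent}"]
    for path in sorted(set(off_topic_paths)):
        # Disallow rules match by prefix, so "$" anchors the rule to this
        # exact path and keeps /threads/3 from also blocking /threads/30.
        lines.append(f"Disallow: {path}$")
    if len(lines) == 1:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical paths; a real script would look these up in the database.
    print(build_robots_txt(["/threads/7", "/threads/3"]), end="")
```

The generated output still grows with the number of off-topic threads; the script only removes the need to maintain the file by hand.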
Ajay Prasad answered Nov 08 '22 18:11