 

How to allow crawlers access to index.php only, using robots.txt?

If I want to allow crawlers to access only index.php, will this work?

User-agent: *
Disallow: /
Allow: /index.php
todd asked Oct 28 '09 14:10



2 Answers

User-agent: *
Allow: /$
Allow: /index.php
Allow: /sitemap.xml
Allow: /robots.txt
Disallow: /

Sitemap: http://www.your-site-name.com/sitemap.xml
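The `Allow: /$` line relies on the `$` end-of-path anchor, a pattern extension that Google and other major crawlers honor but that not every robot implements. As a rough sketch of how such patterns are usually interpreted (the helper names `pattern_to_regex` and `matches` are made up for illustration and are not part of any crawler's API):

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes the common wildcard extensions: '*' matches any run of
    characters and a trailing '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start, so an unanchored pattern acts as a prefix match.
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    for pattern, path in [("/$", "/"), ("/$", "/about.html"),
                          ("/index.php", "/index.php?page=2"), ("/", "/about.html")]:
        print(f"{pattern!r} vs {path!r}: {matches(pattern, path)}")
```

Under these assumptions, `/$` matches only the site root, so the home page stays crawlable while `Disallow: /` still covers every other path. A crawler that ignores `$` would treat `/$` as a literal prefix, and the root URL would then fall under the Disallow rule.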
mRGogo answered Oct 15 '22 04:10



Try swapping the order of Disallow / Allow:

User-agent: *
Allow: /index.php
Disallow: /

See this note from Wikipedia:

"Yet, in order to be compatible to all robots, if you want to allow single files inside an otherwise disallowed directory, you need to place the Allow directive(s) first, followed by the Disallow, for example:"

http://en.wikipedia.org/wiki/Robots.txt

Still, I wouldn't expect it to work consistently across all crawlers.
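To see why the order can matter, here is a minimal sketch comparing two evaluation strategies a crawler might use. Both functions are simplified illustrations with plain prefix matching and made-up names, not any real crawler's code: `first_match` models parsers that stop at the first matching rule, and `longest_match` models the most-specific-rule precedence that Google documents and RFC 9309 describes.

```python
# The question's original order and the swapped order, as (directive, prefix) pairs.
RULES_DISALLOW_FIRST = [("disallow", "/"), ("allow", "/index.php")]
RULES_ALLOW_FIRST = [("allow", "/index.php"), ("disallow", "/")]


def first_match(rules, path):
    """Order-sensitive evaluation: the first rule whose prefix matches wins."""
    for kind, prefix in rules:
        if path.startswith(prefix):
            return kind == "allow"
    return True  # no rule matched, so the path is allowed


def longest_match(rules, path):
    """Order-independent evaluation: the longest matching prefix wins; ties go to allow."""
    best = None
    for kind, prefix in rules:
        if path.startswith(prefix):
            key = (len(prefix), kind == "allow")
            if best is None or key > best:
                best = key
    return True if best is None else best[1]


if __name__ == "__main__":
    for name, rules in [("Disallow first", RULES_DISALLOW_FIRST),
                        ("Allow first", RULES_ALLOW_FIRST)]:
        for check in (first_match, longest_match):
            print(name, check.__name__, check(rules, "/index.php"))
```

With Disallow listed first, a first-match crawler blocks /index.php, while a longest-match crawler allows it in either order. Putting the Allow line first gives the same result under both strategies, which is why it is the safer layout.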

UpTheCreek answered Oct 15 '22 04:10
