I am trying to set up my robots.txt, but I am not sure about the regexps.
I've got four different pages, all available in three different languages. Instead of listing each page three times, I figured I could use a regexp:
nav.aspx
page.aspx/changelang (might have a query string attached, such as "?toLang=fr"),
mypage.aspx?id, and
login.aspx/logoff (=12346?... etc., different each time)
All four exist in 3 different languages, e.g.:
www.example.com/es/nav.aspx
www.example.com/it/nav.aspx
www.example.com/fr/nav.aspx
Now, my question is: Is the following regexp correct?
User-Agent: *
Disallow: /*nav\.aspx$
Disallow: /*page.aspx/changelang
Disallow: /*mypage\.aspx?id
Disallow: /*login\.aspx\/logoff
Thanks
Regular expressions are not allowed in robots.txt, but Googlebot (and some other robots) can understand some simple pattern matching.
Your robots.txt should look like this:
User-agent: *
Disallow: /*nav.aspx$
Disallow: /*page.aspx/changelang
Disallow: /*mypage.aspx?id
Disallow: /*login.aspx/logoff
User-agent
directive is valid with lower case a
. You don't have to escape .
or `/'.
You can read more about this here: Block or remove pages using a robots.txt file
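In this pattern syntax, * matches any sequence of characters and a trailing $ anchors the end of the URL; otherwise a Disallow rule is a prefix match, so /*nav.aspx$ blocks /es/nav.aspx but not /es/nav.aspx?x=1. If you want to sanity-check the rules locally, here is a minimal Python sketch of that matching logic (my own approximation for testing, not Google's actual matcher; as far as I know, Python's built-in urllib.robotparser does not implement these wildcard extensions):

import re

def googlebot_match(pattern, path):
    # '*' matches any run of characters; a trailing '$' anchors the
    # end of the URL; everything else (including '?') is literal.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None

rules = ["/*nav.aspx$", "/*page.aspx/changelang",
         "/*mypage.aspx?id", "/*login.aspx/logoff"]

for url in ["/es/nav.aspx",
            "/es/nav.aspx?x=1",
            "/fr/page.aspx/changelang?toLang=fr",
            "/it/mypage.aspx?id=7"]:
    blocked = any(googlebot_match(rule, url) for rule in rules)
    print(url, "->", "blocked" if blocked else "allowed")

Running it prints "blocked" for all the URL shapes you listed, and "allowed" for /es/nav.aspx?x=1, because the $ anchor stops the nav.aspx rule from matching URLs with a query string.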