
Asterisk in robots.txt [closed]

Tags:

seo

robots.txt

Wondering if following will work for google in robots.txt

Disallow: /*.action

I need to exclude all urls ending with .action.

Is this correct?

Alexey asked Mar 12 '10 18:03

People also ask

What does asterisk mean in robots txt?

Crawlers and bots have specific names by which they can be recognized on a server. A note in the robots.txt file can lay out which crawlers must follow which commands. An asterisk (*) denotes a rule for all bots. Google uses various user agents to crawl the internet, the most important of which is the "Googlebot."

How do I fix robots txt error?

Luckily, there's a simple fix for this error. All you have to do is update your robots.txt file (example.com/robots.txt) and allow Googlebot (and others) to crawl your pages. You can test these changes using a robots.txt testing tool.

When * and wildcards should be used in robots txt?

While typical formatting in robots.txt will prevent the crawling of pages in a directory or at a specific URL, using wildcards in your robots.txt file lets you prevent search engines from accessing content based on patterns in URLs – such as a parameter or the repetition of a character.
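For example, a sketch of such a wildcard rule (the sessionid parameter name here is hypothetical) that blocks any URL whose path or query string contains that parameter:

```
User-agent: *
Disallow: /*sessionid=
```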

What does disallow mean in robots txt?

The Disallow directive in robots.txt lets you tell search engines not to access certain files, pages, or sections of your website. The Disallow directive is followed by the path that should not be accessed.
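A minimal robots.txt illustrating the Disallow directive (the paths are hypothetical):

```
# Block all crawlers from a directory and a specific page
User-agent: *
Disallow: /private/
Disallow: /tmp/draft.html
```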


2 Answers

To block files of a specific file type (for example, .gif), use the following:

User-agent: Googlebot
Disallow: /*.gif$

So, you are close: use Disallow: /*.action$ with a trailing "$".

Of course, that's merely what Google suggests: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156449

All bots are different.
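For crawlers that do support Google's extensions, * matches any run of characters and a trailing $ anchors the end of the URL. A minimal sketch of that matching logic (this is an illustration, not Google's actual implementation):

```python
import re

def rule_to_regex(rule: str) -> re.Pattern:
    """Approximate a Google-style robots.txt path rule as a regex.

    '*' matches any run of characters; a trailing '$' anchors the
    end of the URL path. Everything else is matched literally.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal parts, join them with '.*' where '*' appeared
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    return re.compile(pattern)

# 'Disallow: /*.action$' blocks any path that ends in '.action'
rx = rule_to_regex("/*.action$")
print(bool(rx.match("/orders/list.action")))    # True
print(bool(rx.match("/orders/list.action?x")))  # False
```

Without the trailing $, the rule would also block URLs like /orders/list.action?x=1, which may or may not be what you want.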

Ben Griswold answered Nov 11 '22 14:11


The robots.txt specification provides no way to include wildcards; a rule can only match the beginning of a URI path.

Google implements non-standard extensions, described in their documentation (look in the "Manually create a robots.txt file" section under "To block files of a specific file type").
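Under the original specification, a Disallow rule is just a literal path prefix, so a pattern like /*.action does nothing useful for a spec-only crawler. A short sketch of that prefix-only behavior:

```python
def blocked_by_prefix(rule: str, path: str) -> bool:
    """Original robots.txt semantics: rule is a literal path prefix."""
    return path.startswith(rule)

# Under the original spec, 'Disallow: /*.action' only blocks paths
# that literally begin with '/*.action' -- i.e. almost nothing.
print(blocked_by_prefix("/*.action", "/orders/list.action"))   # False
print(blocked_by_prefix("/private/", "/private/report.html"))  # True
```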

Quentin answered Nov 11 '22 15:11