
Is this robots.txt syntax with an empty "Disallow:" correct?

Tags:

robots.txt

Today whilst improving my web crawler to support the robots.txt standard, I came across the following code at http://www.w3schools.com/robots.txt

User-agent: Mediapartners-Google 
Disallow: 

Is this syntax correct? Shouldn't it be Disallow: / or Allow: / depending on the intended purpose?

dangee1705 asked Apr 04 '16 14:04



1 Answer

Disallow:

Will allow everything, as will:

Allow: /

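You're either disallowing nothing or allowing everything, so the effect is the same. The empty Disallow: form is valid and comes from the original robots.txt standard; Allow was added later as an extension, so older parsers may not recognise it.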
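If you want to check this behaviour from a crawler, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name `allowed` and the example.com URL are made up for illustration, and this only shows how that one parser reads the rules, not how every crawler does.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, user_agent, url):
    """Hypothetical helper: parse robots.txt lines and ask whether url may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse() takes an iterable of lines
    return parser.can_fetch(user_agent, url)


AGENT = "Mediapartners-Google"
URL = "http://example.com/some/page.html"  # placeholder URL

# The record from the question: an empty Disallow value.
empty_disallow = ["User-agent: Mediapartners-Google", "Disallow:"]
# The equivalent record written with Allow.
allow_all = ["User-agent: Mediapartners-Google", "Allow: /"]
# For contrast: blocking the whole site.
disallow_all = ["User-agent: Mediapartners-Google", "Disallow: /"]

print(allowed(empty_disallow, AGENT, URL))  # True
print(allowed(allow_all, AGENT, URL))       # True
print(allowed(disallow_all, AGENT, URL))    # False
```

In your own crawler you would normally load the file with set_url() and read() instead of passing the lines by hand.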

Ralph King answered Oct 05 '22 12:10