Today, whilst improving my web crawler to support the robots.txt standard, I came across the following at http://www.w3schools.com/robots.txt:
User-agent: Mediapartners-Google
Disallow:
Is this syntax correct? Shouldn't it be Disallow: /
or Allow: /
depending on the intended purpose?
An empty Disallow line means you're not disallowing anything, so a spider can access all sections of your site.
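Python's standard-library urllib.robotparser will parse that record directly, which makes it easy for a crawler to sanity-check this behaviour. A minimal sketch (the test URL is only an illustration):

from urllib.robotparser import RobotFileParser

# The exact record quoted in the question.
robots_txt = "User-agent: Mediapartners-Google\nDisallow:\n"

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# An empty Disallow value disallows nothing, so any path is fetchable.
print(parser.can_fetch("Mediapartners-Google", "http://www.w3schools.com/html/"))
# -> True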
A robots.txt file tells search engine crawlers which URLs the crawler can access on your site. This is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google. To keep a web page out of Google, block indexing with noindex or password-protect the page.
The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
Disallow:
Will allow everything, as will:
Allow: /
You're either disallowing nothing, or allowing everything.
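A quick way to convince yourself of that equivalence is to run both records through urllib.robotparser and compare the verdicts; the crawler name and URL below are invented for the demonstration:

from urllib.robotparser import RobotFileParser

records = {
    "Disallow: (empty)": "User-agent: *\nDisallow:\n",
    "Allow: /": "User-agent: *\nAllow: /\n",
}

for label, record in records.items():
    parser = RobotFileParser()
    parser.parse(record.splitlines())
    print(label, "->", parser.can_fetch("MyCrawler", "http://example.com/page.html"))

# Both lines print True: disallowing nothing and allowing everything
# come to the same thing.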