
How can I fix the "Googlebot can't access your site" issue?

I keep getting this message:

"Over the last 24 hours, Googlebot encountered 1 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%. You can see more details about these errors in Webmaster Tools. "

I searched for it and was told to add a robots.txt to my site.

But when I test the robots.txt in Google Webmaster Tools (GWT), it just cannot be fetched.

I thought maybe robots.txt was being blocked by my site, but when I test it, GWT says it is allowed.


The file is at http://momentcamofficial.com/robots.txt, and here is its content:

User-agent: *
Disallow:

So why can't Google fetch the robots.txt? What did I miss? Can anybody help me?

Jason asked Aug 18 '14 03:08


2 Answers

I had a situation where Googlebot couldn't fetch my robots.txt, even though I could see a valid robots.txt in my browser.

The problem turned out to be that I was redirecting my whole site (including robots.txt) to https, and Google didn't seem to like that. So I excluded robots.txt from the redirect:

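# Redirect plain-HTTP requests to HTTPS, except for robots.txt
RewriteEngine On
# Only apply to requests that did not arrive over HTTPS
RewriteCond %{HTTPS} off
# Skip robots.txt so it is served directly instead of redirected
RewriteCond %{REQUEST_FILENAME} !robots\.txt
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]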
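
One way to check whether robots.txt is still caught by a redirect like this is to request it over plain http and look at the first response without following redirects. The following is a minimal Python sketch, not a reproduction of how Googlebot fetches the file. The function names and the `robots-check/0.1` User-Agent are made up for illustration.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def redirect_target(status: int, location: Optional[str]) -> Optional[str]:
    """Return the Location value if the response is a redirect, else None."""
    if status in REDIRECT_STATUSES:
        return location
    return None


def first_response(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Return (status, Location header) of the first response.

    http.client does not follow redirects, so a 301 shows up as a 301.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request(
            "GET",
            parts.path or "/",
            headers={"User-Agent": "robots-check/0.1"},  # hypothetical name
        )
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `first_response` with the http URL of your own robots.txt and passing the result to `redirect_target` should give `None` once the exclusion works. If it returns an https URL, robots.txt is still being redirected.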

More info on my blog

user57429 answered Sep 18 '22 06:09


Google's own explanation: "Before Googlebot crawls your site, it accesses your robots.txt file to determine if your site is blocking Google from crawling any pages or URLs. If your robots.txt file exists but is unreachable (in other words, if it doesn’t return a 200 or 404 HTTP status code), we’ll postpone our crawl rather than risk crawling URLs that you do not want crawled. When this happens, Googlebot will return to your site and crawl it as soon as we can successfully access your robots.txt file."

Having a robots.txt is optional, so you don't need to create one. Just make sure your host responds to requests for /robots.txt with either a 200 or a 404 HTTP status and nothing else.
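
To see which status your server actually returns for robots.txt, a Python sketch along these lines may help. It applies the rule quoted above: anything other than 200 or 404 means the crawl is postponed. The names `would_postpone_crawl` and `fetch_status` and the `robots-check/0.1` User-Agent are invented for this example. It only approximates what Googlebot sees, since `urllib` follows redirects and sends different headers.

```python
import urllib.error
import urllib.request


def would_postpone_crawl(status: int) -> bool:
    """Apply the quoted rule: only 200 and 404 let the crawl proceed."""
    return status not in (200, 404)


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for url after urllib has followed any redirects.

    Connection-level failures (DNS, timeout, refused connection) are not
    caught here and surface as urllib.error.URLError or OSError.
    """
    request = urllib.request.Request(
        url,
        headers={"User-Agent": "robots-check/0.1"},  # hypothetical name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status is still available.
        return err.code
```

For example, `would_postpone_crawl(fetch_status("http://momentcamofficial.com/robots.txt"))` returning `True` would suggest the host is answering with something like a 5xx or 403 rather than 200 or 404.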

Moein Nourmohammadi answered Sep 22 '22 06:09