
Why is Googlebot crawling a non-existent CSS file?

Googlebot keeps trying to crawl CSS files that do not exist on my production site.

It asks for:

http://www.mywebsite.com/assets/index-d45678283d4ab9905c3538184826e599.css

This exact file name does not exist in production (there is a slightly different file name in production).

However, the CSS file that it is requesting does exist in development at:

http://localhost:3000/assets/index-d45678283d4ab9905c3538184826e599.css

I'm not sure why it is asking for this file.

I use Capistrano (load "deploy/assets") to precompile my assets before deploying to production.

Right now, I just block this file in robots.txt, but the CSS file that it requests changes after every deployment.

Why does Googlebot want to crawl a file that doesn't exist on the production site? How do I stop it?

Hung Luu asked Oct 21 '22 07:10



1 Answer

Googlebot is probably doing one of two things:

  • It sees that file referenced somewhere on your site, for example in markup left over from an old build. I would search your live (bundled) site for the file name.
  • It remembers that file from a previous build and is trying to check it for updates.
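To check the first possibility, the search for a leftover reference could be sketched roughly as follows. This is only an illustration in Python using the standard library: `find_stale_references` and `sample_html` are hypothetical names, and the sample markup is made up. In practice you would pass in the HTML of each live page.

```python
from html.parser import HTMLParser


class AssetRefParser(HTMLParser):
    """Collects href/src attribute values found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.refs.append(value)


def find_stale_references(html, stale_name):
    """Return every href/src value in `html` that contains `stale_name`."""
    parser = AssetRefParser()
    parser.feed(html)
    return [ref for ref in parser.refs if stale_name in ref]


# Hypothetical page source, for illustration only.
sample_html = """
<html><head>
<link rel="stylesheet" href="/assets/index-d45678283d4ab9905c3538184826e599.css">
<link rel="stylesheet" href="/assets/other.css">
</head><body></body></html>
"""

stale = "index-d45678283d4ab9905c3538184826e599.css"
matches = find_stale_references(sample_html, stale)
```

If `matches` is non-empty for any live page, that page still links to the development fingerprint, which would explain the requests.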

It puzzles me that it does not drop the URL after getting a 404 response from your server. However, the inner workings of Google's software are a black box, so there's no sure way to tell why it does what it does.
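One way to confirm what your server actually returns to Googlebot is to look through the access log. The following is a minimal sketch that assumes the log uses the common "combined" format; the function name `googlebot_404_paths` and the sample lines (including the IP addresses and timestamps) are made up for illustration.

```python
import re

# Matches the request, status, size, referer and user-agent parts of a
# combined-format access log line (an assumption about your log format).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_404_paths(lines):
    """Return the paths that a Googlebot user agent requested and got a 404 for."""
    paths = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        if match.group("status") == "404" and "Googlebot" in match.group("agent"):
            paths.append(match.group("path"))
    return paths


# Hypothetical log lines, for illustration only.
sample_log = [
    '192.0.2.1 - - [01/Jan/2023:00:00:00 +0000] '
    '"GET /assets/index-d45678283d4ab9905c3538184826e599.css HTTP/1.1" 404 162 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.2 - - [01/Jan/2023:00:00:01 +0000] '
    '"GET /assets/missing.css HTTP/1.1" 404 162 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '192.0.2.1 - - [01/Jan/2023:00:00:02 +0000] '
    '"GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

missing = googlebot_404_paths(sample_log)
```

Note that the user-agent string can be spoofed, so this only shows requests that claim to come from Googlebot.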

That said, Google offers the Webmaster Tools panel, which lets you customize some aspects of how your site is indexed.

Christian Stewart answered Nov 15 '22 06:11
