Why does Google find a page excluded by robots.txt?

Question

I'm using robots.txt to exclude some pages from spiders.

User-agent: * 
Disallow: /track.php

When I search for something related to this page, Google says: "A description for this result is not available because of this site's robots.txt – learn more."

That means the robots.txt is working, but why is the link to the page still found by the spider? I'd like no link to the 'track.php' page to appear at all. How should I set up the robots.txt (or something else, such as .htaccess)?

Jim Mischel · Accepted Answer

Here's what happened:

Googlebot saw, on some other page, a link to track.php. Let's call that page "source.html".
Googlebot tried to visit your track.php file.
Your robots.txt told Googlebot not to read the file.

So Google knows that source.html links to track.php, but it doesn't know what track.php contains. You didn't tell Google not to index track.php; you told Googlebot not to read and index the data inside track.php.
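The crawl side of this can be checked locally. The following is a minimal sketch using Python's standard urllib.robotparser, with the rules from the question pasted in as a string. The helper name may_fetch is made up for illustration, and the result only says whether a well-behaved crawler may fetch a path; it says nothing about whether the URL can be listed in search results.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question, parsed from a string instead of fetched over HTTP.
ROBOTS_TXT = """\
User-agent: *
Disallow: /track.php
"""


def may_fetch(path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above allow user_agent to fetch path.

    Hypothetical helper: it models crawling permission only, not indexing.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # track.php is blocked from being read; source.html is not.
    print("track.php fetch allowed:", may_fetch("/track.php"))
    print("source.html fetch allowed:", may_fetch("/source.html"))
```

A crawler that obeys these rules can still read source.html, see the link to track.php there, and record that URL without ever requesting it.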

As Google's documentation says:

While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.

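There's not a lot you can do about this with robots.txt alone. For your own pages, you can use the X-Robots-Tag response header or the noindex meta tag as described in that documentation. That will prevent Google from indexing the URL if it finds a link to it in your pages. Note that Googlebot only sees either directive if it is allowed to fetch the page, so track.php must not be disallowed in robots.txt for this to work. But if track.php stays blocked and some page that you don't control links to it, then Google is quite likely to index the URL.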
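As a rough illustration of the header approach, here is a small sketch of a handler that sends X-Robots-Tag: noindex with its response. It is written as a Python WSGI app purely as a stand-in for track.php; the name track_app, the port, and the response body are assumptions, not anything from the question. The in-page equivalent is a meta tag named "robots" with content "noindex" in the HTML head.

```python
from wsgiref.simple_server import make_server


def track_app(environ, start_response):
    """Hypothetical stand-in for track.php that asks search engines not to index it."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        # The crawler must be allowed to fetch this URL to see the header,
        # so the path should not also be disallowed in robots.txt.
        ("X-Robots-Tag", "noindex"),
    ]
    start_response("200 OK", headers)
    return [b"ok"]


if __name__ == "__main__":
    # Serve on an arbitrary local port for manual testing.
    with make_server("127.0.0.1", 8000, track_app) as server:
        server.serve_forever()
```

With the server running, inspecting the response headers for any path on 127.0.0.1:8000 should show the X-Robots-Tag line. In a PHP or Apache setup the same header would be set by the script or the server configuration instead.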

Tags:

.htaccess

web-crawler

robots.txt

Alberto Fecchi

1 Answers

Jim Mischel