
<noindex> tag for Google


I would like to tell Google not to index certain parts of a page. Yandex (a Russian search engine) has a very useful tag for this called <noindex>. How can the same be done for Google?

teslasimus asked Mar 28 '13 15:03


People also ask

How can I add noindex to Google?

You can prevent a page or other resource from appearing in Google Search by including a noindex meta tag or header in the HTTP response. When Googlebot next crawls that page and sees the tag or header, Google will drop that page entirely from Google Search results, regardless of whether other sites link to it.
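
As a rough illustration of the meta tag variant, a page-level noindex might look like the sketch below. The page title and body text are made up for the example. The same directive can also be sent as an "X-Robots-Tag: noindex" HTTP response header, which is useful for non-HTML resources such as PDFs. Note that either form applies to the whole page, not to parts of it.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Hypothetical page title, for illustration only -->
  <title>Checkout</title>
  <!-- Ask all crawlers that honor the robots meta tag not to index this page -->
  <meta name="robots" content="noindex">
  <!-- Alternatively, address only Google's crawler -->
  <meta name="googlebot" content="noindex">
</head>
<body>
  <p>Page content that should stay out of search results.</p>
</body>
</html>
```

For the crawler to see the directive at all, the page must not be blocked in robots.txt.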

Does Google crawl noindex?

Google will still crawl your web site and any web pages that carry the noindex tag. It needs to, in order to know what not to index. You can use robots.txt, nofollow, and other means to try to slow or limit what Google picks up, but the noindex tag itself does not stop crawling.

What is noindex in SEO?

“Noindex” meta robots tags: webmasters typically use the “noindex” directive to keep content that is not intended for search engines out of the index. Common use cases include pages containing sensitive information and shopping cart or checkout pages on an eCommerce website.

What is noindex in Google console?

If you submitted a page for Google to index and received the Submitted URL Marked 'noindex' error message, it means that Google has identified that your page should not be indexed and displayed in search results.


1 Answer

According to Wikipedia[1], there are some rules that some spiders follow:

    <!--googleoff: all-->
    This should not be indexed by Google. Though its main spider, Googlebot, might ignore that hint.
    <!--googleon: all-->

    <div class="robots-nocontent">Yahoo bots won't index this.</div>

    <noindex>Yandex bots ignore this text.</noindex>
    <!--noindex-->They will ignore this, too.<!--/noindex-->

Unfortunately, they could not agree on a single standard it seems – and to my knowledge, there's nothing to keep all spiders off...

The googleoff: comment seems to support different options, though I'm not sure where there's a complete list. There's at least:

  • all: completely ignore the block
  • index: content doesn't go into Google's index
  • anchor: anchor text for links will not be associated with the target page
  • snippet: text will not be used to create snippets for search results

Note as well that (at least for Google) this only affects the search index, not page ranking and the like. Furthermore, as Stephen Ostermiller correctly pointed out in his comment below, googleon and googleoff only work with the Google Search Appliance and unfortunately have no effect on the normal Googlebot.

There's also an article on the Yahoo part[2] (and an article describing that Yandex also honors <noindex>[6]). On the googleoff: part, also see this answer, and the article I took most of the related information from.[3]


Additionally, Google Webmaster Tools recommends using the rel=nofollow attribute[4] for specific links (e.g. ads, or links to pages that are not accessible or useful to the bots, such as login/signup). That means the rel attribute of the HTML a element should be honored by the Google bots, though that mainly relates to page rank, not to the search index itself. Unfortunately, it seems there's no rel=noindex[5][7]. I'm also not sure whether this attribute could be used on other elements as well (e.g. <DIV REL="noindex">); but unless crawlers honor "noindex", that wouldn't make sense either.
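
For illustration, link-level nofollow could look like the following sketch. The URLs and link texts are placeholders, not taken from any real site. Keep in mind that this only hints at how the link itself should be treated; it does not keep the surrounding text out of the index.

```html
<!-- Hypothetical links, for illustration only -->

<!-- A login page that is of no use to crawlers -->
<a href="https://example.com/login" rel="nofollow">Log in</a>

<!-- An advertisement link -->
<a href="https://example.com/sponsor" rel="nofollow">Sponsored link</a>

<!-- A regular link, followed as usual -->
<a href="https://example.com/docs">Documentation</a>
```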


Further references:

  • How to Noindex parts of a web page?
  • Excluding crawler from sections of pages (Spiderline crawler). As you can see, other crawlers might use their own proprietary markers (see also the AddSearch crawler). I wish they had simply made REL="noindex" a standard that could be used with any HTML tag such as DIV/SPAN/P/A!
  • Preventing Google from indexing the contents of a div by reversing the string
  • Methods for preventing search engines from indexing irrelevant content on a page

[1] Wikipedia: Noindex
[2] Which Sections of Your Web Pages Might Search Engines Ignore?
[3] Tell Google to Not Index Certain Parts of Your Page
[4] Use rel="nofollow" for specific links
[5] Is it a good idea to use <a href=“http://name.com” rel=“noindex, nofollow”>name</a>?
[6] Using HTML tags — Yandex.Help. Webmaster
[7] existing REL values

Izzy answered Oct 12 '22 18:10