I have a blog (it's linked from my profile, if you want to see it). It's new, and so are Google's crawl results for it.
The results alarmed me. Apparently the two most common words on my site are "rss" and "feed", because I use link text like "Comments RSS", "Post Feed", etc. These two words appear in every post, while other words are much rarer.
Is there a way to make these links disappear from Google's parsing? I don't want technical links getting indexed. I only want content, titles, descriptions to get indexed. I am looking for something other than replacing this text with images.
I found some old discussions on Google from back in 2007 (in 3 years many things could have changed, hopefully this too).
This question is not about robots.txt and how to make Google ignore whole pages. It is about making Google ignore small parts of a page, or transforming those parts so that they remain visible to humans but invisible to robots.
The “Disallow: /” part means that it applies to your entire website. In effect, this will tell all robots and web crawlers that they are not allowed to access or crawl your site.
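As a rough illustration of that rule, the sketch below feeds a two-line robots.txt to Python's standard `urllib.robotparser` and asks whether a crawler may fetch a few paths. The `User-agent: *` line, the `example.com` host, the sample paths, and the helper name `is_allowed` are assumptions made for the example; only the `Disallow: /` part comes from the text above.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the "User-agent: *" line is an assumption,
# since the text above only quotes the "Disallow: /" part.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical paths on a hypothetical host, for illustration only.
    for path in ("/", "/feed/", "/2010/05/some-post/"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Note that this blocks whole URLs, which is exactly why robots.txt does not help with hiding only part of a page.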
Go to the View Settings section under the Admin section in your Google Analytics view, and check the 'Exclude all hits from known bots and spiders' box. You will then start filtering out bot and spider traffic which will make reporting on human visits and activity much clearer.
These bots "crawl" webpages and index the content so that it can show up in search engine results. A robots.txt file helps manage the activities of these web crawlers so that they don't overtax the web server hosting the website, or index pages that aren't meant for public view.
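There is a simple way to tell Google not to index parts of your documents: wrap them in googleoff and googleon comment tags. Note that these tags were documented for the Google Search Appliance, so Google's public web search may not honor them: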
<p>This is normal (X)HTML content that will be indexed by Google.</p>
<!--googleoff: index-->
<p>This (X)HTML content will NOT be indexed by Google.</p>
<!--googleon: index-->
In this example, the second paragraph will not be indexed by Google. Notice the “index” parameter, which may be set to any of the following:
index: content surrounded by “googleoff: index” will not be indexed by Google
anchor: anchor text for any links within a “googleoff: anchor” area will not be associated with the target page
snippet: content surrounded by “googleoff: snippet” will not be used to create snippets for search results
all: content surrounded by “googleoff: all” is treated as if all of the above were set
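To preview what a crawler that honors these tags would be left with, one option is to strip the marked regions locally. The following is only a minimal sketch: the function name `strip_googleoff` is made up for this example, it relies on a regular expression rather than a real HTML parser, and it assumes every googleoff comment is closed by a googleon comment with the same parameter. It says nothing about how Google itself processes the tags.

```python
import re


def strip_googleoff(html: str, kind: str = "index") -> str:
    """Remove regions wrapped in <!--googleoff: KIND--> ... <!--googleon: KIND-->.

    Regions marked with "all" are removed as well, since "all" covers
    every other parameter. Unpaired tags are left untouched.
    """
    pattern = re.compile(
        r"<!--\s*googleoff:\s*(" + re.escape(kind) + r"|all)\s*-->"
        r".*?"
        r"<!--\s*googleon:\s*\1\s*-->",
        re.DOTALL | re.IGNORECASE,
    )
    return pattern.sub("", html)


if __name__ == "__main__":
    sample = (
        "<p>This is normal (X)HTML content that will be indexed by Google.</p>\n"
        "<!--googleoff: index-->\n"
        "<p>This (X)HTML content will NOT be indexed by Google.</p>\n"
        "<!--googleon: index-->\n"
    )
    print(strip_googleoff(sample))
```

Running a word count on the stripped output is a quick way to check whether link text such as "Comments RSS" would still dominate the page.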
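Google leaves content out of search snippets when it sits inside an HTML element with the data-nosnippet attribute (supported on span, div, and section elements). This affects snippets only; the text can still be indexed: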
<p> This text can be included in a snippet <span data-nosnippet>and this part would not be shown</span>. </p>
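A small local check can show which text remains eligible for a snippet under this rule. The sketch below uses Python's standard `html.parser`; the class name `SnippetTextExtractor` and the helper `visible_snippet_text` are invented for this example, and the parsing is deliberately simple (it assumes reasonably well-formed markup), so treat it as an approximation rather than a description of Google's behavior.

```python
from html.parser import HTMLParser

# Elements on which Google documents data-nosnippet support.
SUPPORTED = {"span", "div", "section"}
# Void elements never get a closing tag, so they are not tracked.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class SnippetTextExtractor(HTMLParser):
    """Collects text that is not inside a data-nosnippet element."""

    def __init__(self):
        super().__init__()
        self._open = []          # (tag, starts_hidden_region) per open element
        self._hidden_depth = 0   # > 0 while inside a data-nosnippet element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        hidden = tag in SUPPORTED and any(name == "data-nosnippet" for name, _ in attrs)
        self._open.append((tag, hidden))
        if hidden:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if not any(open_tag == tag for open_tag, _ in self._open):
            return  # stray closing tag; ignore it
        while self._open:
            open_tag, hidden = self._open.pop()
            if hidden:
                self._hidden_depth -= 1
            if open_tag == tag:
                break

    def handle_data(self, data):
        if self._hidden_depth == 0:
            self.parts.append(data)


def visible_snippet_text(html: str) -> str:
    """Return the whitespace-normalized text outside data-nosnippet elements."""
    parser = SnippetTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = ("<p> This text can be included in a snippet "
              "<span data-nosnippet>and this part would not be shown</span>. </p>")
    print(visible_snippet_text(sample))
```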
Source: Special tags that Google understands - Inline directives