Methods for preventing search engines from indexing irrelevant content on a page

I'm looking for ways to prevent parts of a page from being indexed. Specifically, the comments on a page, since they add a lot of weight to an entry based on whatever users have written. This makes a Google search against the site return lots of irrelevant pages that only match because of their comments.

Here are the options I'm considering so far:

1) Load comments using JavaScript to prevent search engines from seeing them.

2) Use user agent sniffing to simply not output comments for crawlers.

3) Use search engine-specific markup to hide parts of the page. This solution seems quirky at best, though. Allegedly, this can be done to prevent Yahoo! from indexing specific content:

<div class="robots-nocontent">
This content will not be indexed!
</div>

That's a very ugly way to do it. I read about a Google solution that looks better, but I believe it only works with the Google Search Appliance (can someone confirm this?):

<!--googleoff: all-->
This content will not be indexed!
<!--googleon: all-->

Does anyone have other methods to recommend? Which of the three above would be the best way to go? Personally, I'm leaning towards #2: while it might not work for all search engines, it's easy to target the biggest ones, and it has no side effects for users unless they're deliberately impersonating a web crawler. A rough sketch of what I have in mind follows below.
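
For the record, a rough sketch of what I mean by #2, assuming a Node/Express setup (the route, template name, and bot list are just placeholders, not a definitive implementation):

// Sketch: skip rendering comments when the User-Agent looks like a crawler.
var express = require('express');
var app = express();

// Only the biggest crawlers; a real list would need to be longer.
var BOTS = /googlebot|bingbot|slurp|duckduckbot/i;

app.get('/post/:id', function (req, res) {
    var userAgent = req.get('User-Agent') || '';
    res.render('post', {
        postId: req.params.id,
        // The post template only outputs the comment markup when this is true.
        showComments: !BOTS.test(userAgent)
    });
});

app.listen(3000);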

Asked Dec 29 '09 by Blixt


1 Answer

I would go with your JavaScript option. It has two advantages:

1) Bots don't see the comments at all.

2) It can speed up your page load time if you load the comments asynchronously and unobtrusively (e.g. via jQuery), and page load time has a much underrated positive effect on your search rankings. See the sketch below.
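
A minimal sketch of what I mean, assuming jQuery is already on the page (the /comments endpoint and #comments container are placeholders for your own markup and back end):

$(function () {
    // Fetch the comment HTML after the page has rendered, so crawlers
    // that don't execute JavaScript never request or see it.
    $.get('/comments', { post: 123 }, function (html) {
        $('#comments').html(html);
    });
});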

Answered Oct 12 '22 by autonomatt