I'm looking for ways to prevent search engines from indexing parts of a page. Specifically, I want to exclude comments, since they skew an entry's relevance heavily toward whatever users have written. This causes Google searches to return my pages as irrelevant results.
Here are the options I'm considering so far:
1) Load comments using JavaScript to prevent search engines from seeing them.
2) Use user agent sniffing to simply not output comments for crawlers.
3) Use search engine-specific markup to hide parts of the page. This solution seems quirky at best, though. Allegedly, this can be done to prevent Yahoo! from indexing specific content:
<div class="robots-nocontent">
This content will not be indexed!
</div>
That's a very ugly way to do it, though. I read about a Google solution that looks better, but I believe it only works with the Google Search Appliance (can someone confirm this?):
<!--googleoff: all-->
This content will not be indexed!
<!--googleon: all-->
Does anyone have other methods to recommend? Which of the three above would be the best way to go? Personally, I'm leaning towards #2: while it might not work for all search engines, it's easy to target the biggest ones, and it has no side effects for users unless they're deliberately impersonating a web crawler. A rough sketch of what I mean by #2 follows.
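For illustration, here is a minimal sketch of option #2 as I picture it, assuming a Node.js/Express server; the route, the 'post' template, and the crawler list are all hypothetical, so adapt them to your stack:

const express = require('express');
const app = express();

// Hypothetical pattern covering a few major crawlers
// (Googlebot, Bingbot, Yahoo! Slurp, DuckDuckBot); extend as needed.
const CRAWLER_PATTERN = /googlebot|bingbot|slurp|duckduckbot/i;

app.get('/posts/:id', (req, res) => {
  // Treat the request as a crawler if the User-Agent matches the pattern.
  const isCrawler = CRAWLER_PATTERN.test(req.get('User-Agent') || '');

  // The (hypothetical) 'post' template renders the comment section
  // only when includeComments is true, so crawlers never see it.
  res.render('post', { id: req.params.id, includeComments: !isCrawler });
});

app.listen(3000);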
I would go with your JavaScript option. It has two advantages:
1) Bots don't see the comments.
2) It speeds up your page load time if you load the comments asynchronously and unobtrusively (e.g. via jQuery). Page load time has a much underrated positive effect on your search rankings.
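Something like this minimal sketch, assuming jQuery is loaded and you have an endpoint that returns the comments as an HTML fragment (the URL and markup below are illustrative):

<div id="comments"><!-- comments are injected here after page load --></div>
<script>
  // Once the DOM is ready, fetch the comment fragment and insert it.
  // Crawlers that don't execute JavaScript never see this content.
  $(function () {
    $('#comments').load('/posts/123/comments');
  });
</script>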