
SEO for Backbone.js app on apache server - phantom.js and node.js?

I'm working on a Backbone.js/Marionette website that needs to be search engine optimized (SEO). We're using a Java/Spring RESTful backend and an Apache 2.2 web server. I'm currently implementing pushState in our app while it's still in the early stages.

What I've come up with so far as a solution:

  • For normal users with JavaScript-enabled browsers, use a purely client-side Backbone implementation.
  • Use Apache's mod_rewrite to route all paths to our index.html page with the path intact, so that Backbone.js returns the correct page and the URL retains its form. I have this much working correctly (minus one bug).
  • Sniff for bots/crawlers in Apache's httpd.conf file, and create rewrite rules to reroute bots to our node.js server (a rough rewrite sketch follows this list).
  • Generate HTML/content using PhantomJS and return that to the web crawler.
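A rough sketch of what the Apache side could look like, assuming mod_rewrite and mod_proxy are enabled and the node.js renderer listens on port 3000 (the port and the bot user-agent list below are illustrative assumptions, not a definitive configuration):

    <VirtualHost *:80>
        DocumentRoot /var/www/app

        RewriteEngine On

        # Known crawlers get proxied to the node.js/PhantomJS renderer
        # (port 3000 and the user-agent list are assumptions to adjust).
        RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|yandex|baiduspider) [NC]
        RewriteRule ^(.*)$ http://localhost:3000$1 [P,L]

        # Real files (js, css, images) are served as-is.
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f
        RewriteRule ^ - [L]

        # Everything else falls through to index.html so the Backbone
        # pushState router can resolve the path on the client.
        RewriteRule ^ /index.html [L]
    </VirtualHost>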

We don't need the site to be fully functional for the bot, but it must return the correct content. We are using mustache templates, but we want a DRY site and feel that any sort of java template rendering would get incredibly messy as the site grows. We hope to have this site around for many years, and are not trying to hook into a ton of 3rd party libraries (at least not many more than we already are).
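For the PhantomJS step mentioned above, a minimal render script might look like the sketch below; it would be run as phantomjs render.js <url>, and the fixed 2-second delay is a placeholder for however you decide to detect that Marionette has finished rendering:

    // render.js -- run with: phantomjs render.js http://localhost/some/path
    // Loads the Backbone/Marionette app headlessly and prints the rendered HTML,
    // which the node.js layer can then return to the crawler.
    var page = require('webpage').create();
    var system = require('system');
    var url = system.args[1];

    page.open(url, function (status) {
        if (status !== 'success') {
            console.error('Failed to load ' + url);
            phantom.exit(1);
        }
        // Give the app time to fetch data and render its views; a fixed delay
        // is the simplest approach, polling for a "rendered" flag is more robust.
        setTimeout(function () {
            console.log(page.content);
            phantom.exit(0);
        }, 2000);
    });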

Does anyone have any experience or advice on this topic? From my research, others are a little wary, specifically in this related question. I'm somewhat concerned about whether bots "click" through JavaScript versus performing plain GET requests. Thoughts and advice?

Thanks very much in advance.

Asked Mar 08 '13 by Andrew


1 Answer

Very bad idea; sorry to be so blunt.

What you want to do is the following:

If I hit http://yoursite.com/path/to/resource via a straight-up HTTP request, your server should serve me the HTML for that resource page; then, if you want, you can use JavaScript at that point to "init" the single-page-app aspect. From there, if I'm navigating via AJAX and Backbone routes, it's all good. If I then copy a URL, close my browser, and paste it back in when I reopen, I should expect to see the same HTML.

This has proven to be the best approach not only for SEO, but also for conceptually designing your system, and for ensuring you "work" for everybody, not just fast JS-enabled browsers.

What you want to avoid at all costs is trying to trick the crawlers and feeding them different content than what a user would see... that is a recipe for a blacklist.

In summary, build your site so that if you hit the URLs via plain HTTP you get the full HTML, and if you hit the same URL in single-page-app mode via AJAX you get the partial you need to keep it all in sync... better architecture, fewer SEO blacklists!
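As a rough illustration of that pattern (shown here with node/express purely for brevity; the same idea applies to a Java/Spring controller, and the render helpers are hypothetical):

    var express = require('express');
    var app = express();

    // Hypothetical render helpers: the partial is what the Backbone view needs,
    // and the full page wraps that same partial in the complete HTML document.
    function renderItemPartial(id) {
        return '<div class="item">Item ' + id + '</div>';
    }
    function renderFullPage(body) {
        return '<html><body>' + body + '<script src="/app.js"></script></body></html>';
    }

    app.get('/items/:id', function (req, res) {
        var partial = renderItemPartial(req.params.id);
        if (req.xhr) {
            // Single-page-app navigation: return just the fragment.
            res.send(partial);
        } else {
            // Direct hit (user, crawler, pasted URL): return the full document,
            // then the client-side app can "init" on top of it.
            res.send(renderFullPage(partial));
        }
    });

    app.listen(3000);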

Answered Oct 23 '22 by Nick Sharp