 

Make Ember app crawlable

I'm reading Google's specification for AJAX crawling. I understand the concept, but I need some clarification.

My URLs all look like this:

http://www.website.com/#!/eng/home
http://www.website.com/#!/eng/contacts
...

I have to provide the HTML snapshots at these addresses:

http://www.website.com/?_escaped_fragment_=/eng/home
http://www.website.com/?_escaped_fragment_=/eng/contacts
...

Is this correct? Or should I remove the leading "/" in the "_escaped_fragment_" URL (e.g. http://www.website.com/?_escaped_fragment_=eng/home), or use something else?

I generate the HTML snapshots with PhantomJS, but what is the best way to serve these snapshots to the crawler? Using Node.js? Using .htaccess rewrite rules?

Cereal Killer asked Oct 30 '13 04:10


2 Answers

OK, since I finally solved this, I would like to share the approach I found.

First of all, the HTML snapshot must be served to the crawler at a specific URL in which

?_escaped_fragment_=

replaces

#!

So if you have:

http://www.website.com/#!/eng/home

your server must provide the snapshot at:

http://www.website.com/?_escaped_fragment_=/eng/home
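
To make the mapping concrete, here is a minimal Python sketch of the URL the crawler requests for a given #! URL. The function name to_escaped_fragment_url is hypothetical, and percent-encoding everything except "/" and unreserved characters is my own conservative assumption, not something the spec requires for every character. Note that the leading "/" after "#!" is kept as part of the value.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment_url(url: str) -> str:
    """Map a #! URL to the ?_escaped_fragment_= URL a crawler would request.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # not a hashbang URL, so there is nothing to map

    # Everything after "#!" becomes the parameter value, leading "/" included.
    # Assumption: encode conservatively, leaving only "/" unescaped.
    value = quote(parts.fragment[1:], safe="/")

    # Append to an existing query string if there is one.
    separator = "&" if parts.query else ""
    query = f"{parts.query}{separator}_escaped_fragment_={value}"
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment_url("http://www.website.com/#!/eng/home"))
```

URLs without a "#!" fragment are returned unchanged, so the helper can be run over a whole list of site URLs.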

If anyone is interested in how I generate the snapshots: I simply use a Node module called judo (https://npmjs.org/package/judo). To use it you need PhantomJS (http://phantomjs.org/) and Node (http://nodejs.org/) installed on your server (more information about installing PhantomJS on the server: How can I setup & run PhantomJS on Ubuntu?).

Once everything is installed, you just need to write a JS file that uses judo (e.g. judo.js); following the doc page linked above, you will be ready in 5 minutes. Upload the file to the server and run it with node to create the snapshots and the sitemap.

After this, you need to serve the HTML snapshots to Google's crawler when it asks for ?_escaped_fragment_= URLs. The simplest way, in my opinion, is an .htaccess file; you need just 3 rewrite directives, which in my case are:

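RewriteEngine On
# Capture the path that follows "_escaped_fragment_=/" in the query string
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=/(.*)$
# For requests to the site root, serve the matching pre-rendered snapshot
RewriteRule ^$ /seo/snapshots/%1.html [L]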

(since my judo.js file creates the snapshots in the /seo/snapshots directory)

Finally, you can check that everything works using the "Fetch as Google" option in the Google Webmaster Tools panel; if you did everything correctly, you will see that the result is the HTML snapshot.
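
If you want to sanity-check which snapshot file a given request ends up at, the same mapping can be sketched in Python. This is only an illustration of what the rewrite rule does, not a replacement for it. The names SNAPSHOT_ROOT and snapshot_path are hypothetical, and the path-traversal guard is an extra precaution I added that the .htaccess rule itself does not contain.

```python
import posixpath
import re
from typing import Optional

# Assumption: snapshots live under this directory, as in the rule above.
SNAPSHOT_ROOT = "/seo/snapshots"

# Same pattern as the RewriteCond: capture what follows "_escaped_fragment_=/".
_PATTERN = re.compile(r"^_escaped_fragment_=/(.*)$")


def snapshot_path(query_string: str) -> Optional[str]:
    """Return the snapshot path for a crawler query string, or None."""
    match = _PATTERN.match(query_string)
    if match is None:
        return None  # ordinary request, no snapshot involved

    # Mirror the RewriteRule substitution: /seo/snapshots/<capture>.html
    candidate = posixpath.normpath(f"{SNAPSHOT_ROOT}/{match.group(1)}.html")

    # Extra guard (not in the .htaccess rule): reject paths that escape the root.
    if not candidate.startswith(SNAPSHOT_ROOT + "/"):
        return None
    return candidate


print(snapshot_path("_escaped_fragment_=/eng/home"))
```

A request whose query string does not start with _escaped_fragment_=/ yields None, which corresponds to the RewriteCond not matching and the normal app being served.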

Cereal Killer answered Oct 07 '22 00:10


Usually I don't answer SO posts by suggesting a paid service, but in this case I think you should really consider using BromBone - http://www.emberjsseo.com

Mike Grassotti answered Oct 06 '22 23:10