On a pushState-enabled page, you normally redirect SEO bots using the _escaped_fragment_ convention. You can read more about that here.
The convention assumes that you use a hashbang (#!) prefix before all of your URIs in a single-page application. SEO bots escape these fragments by replacing the hashbang with their own recognizable convention, _escaped_fragment_, when making a page request.
//Your page
http://example.com/#!home
//Requested by bots as
http://example.com/?_escaped_fragment_=home
This allows the site administrator to detect bots, and redirect them to a cached prerendered page.
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
# %1 is the fragment captured by the RewriteCond above ($1 would be the request path)
RewriteRule ^(.*)$ https://s3.amazonaws.com/mybucket/%1 [P,QSA,L]
The problem is that the hashbang is quickly being phased out as pushState support becomes widely adopted. It's also really ugly and isn't very intuitive to a user.
So what if we used HTML5 mode where pushState guides the entire user application?
//Your index is using pushState
http://example.com/
//Your category is using pushState (not a folder)
http://example.com/category
//Your category/subcategory is using pushState
http://example.com/category/subcategory
Can rewrite rules guide bots to your cached version using this newer convention? There is a related question, but it only accounts for the index edge case. Google also has an article that suggests an opt-in method for this single edge case: putting <meta name="fragment" content="!"> in the <head> of the page. Again, this is for a single edge case. Here we are talking about handling every page as an opt-in scenario.
http://example.com/?_escaped_fragment_=
http://example.com/category?_escaped_fragment_=
http://example.com/category/subcategory?_escaped_fragment_=
I'm thinking that _escaped_fragment_ could still be used as an identifier for SEO bots, and that I could extract everything in between the domain and this identifier to append to my bucket location, like:
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
# (high-level example; I have no idea how to do this)
# extract "category/subcategory" == $2
# from http://example.com/category/subcategory?_escaped_fragment_=
RewriteRule ^(.*)$ https://s3.amazonaws.com/mybucket/$2 [P,QSA,L]
What's the best way to handle this?
I had a similar problem on a single-page web app.
The only solution I found to this problem was effectively creating static versions of pages for the purpose of making something navigable by the Google (and other) bots.
You could do this yourself, but there are also services that do exactly this and create your static cache for you (and serve up the snapshots to the bots over their CDN).
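If you go the do-it-yourself route, the rewrite side could look roughly like the sketch below. It is not a tested configuration. It assumes mod_rewrite and mod_proxy are enabled, that the rules live in an .htaccess file at the site root, and that each snapshot is stored in the bucket under the same path as the page it mirrors (the bucket name is the one from the question).

```apache
# Sketch only: assumes snapshots are stored in the bucket under the same
# path as the pushState URL, e.g. mybucket/category/subcategory.
RewriteEngine On

# Proxying to an https:// target also needs this in the server or
# virtual host config (it is not allowed in .htaccess):
# SSLProxyEngine On

# Match crawler requests such as /category/subcategory?_escaped_fragment_=
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
# $1 is the request path captured by the rule pattern (no leading slash in
# .htaccess context). The trailing "?" drops the original query string so
# it is not forwarded to the bucket.
RewriteRule ^(.*)$ https://s3.amazonaws.com/mybucket/$1? [P,L]
```

With this rule, a request for http://example.com/category/subcategory?_escaped_fragment_= would be proxied to https://s3.amazonaws.com/mybucket/category/subcategory. The index page captures an empty path, so it would likely need its own rule pointing at whatever name the index snapshot is saved under.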
I ended up using SEO4Ajax, although other similar services are available!