
Making GWT application crawlable by a search engine




I want to use the #! token to make my GWT application crawlable, as described here: http://code.google.com/web/ajaxcrawling/

There is a GWT sample app available online that does this. For example, for the URL http://gwt.google.com/samples/Showcase/Showcase.html#!CwRadioButton it serves the following static page to Googlebot: http://gwt.google.com/samples/Showcase/Showcase.html?_escaped_fragment_=CwRadioButton

I want my GWT app to do something similar. In short, I'd like to serve a different flavor of the page whenever the _escaped_fragment_ parameter is found in the URL.

What should I modify so that the server serves something else (a static page, or a page generated dynamically through a headless browser like HtmlUnit)? I'm guessing it could be the web.xml file, but I'm not sure.

(Note: I thought of checking the Showcase app provided with the GWT SDK, but unfortunately it doesn't seem to support serving static files on _escaped_fragment_, and it doesn't use the #! token.)

Philippe Beaudoin asked Mar 12 '10 03:03

2 Answers

If you want to do this in web.xml alone, I don't think a servlet-mapping will work, because url-patterns ignore the query string (the GET parameters). (I'm not 100% sure there isn't another way to make this possible.)

You could of course map Showcase.html to a servlet, and decide in that servlet what to do based on the "_escaped_fragment_" GET parameter. But it's a little expensive to invoke a servlet just to serve a static page for the majority of requests. (It's not too bad, and you could set cache headers if you're sure the page doesn't change.)
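The decision such a servlet would make can be sketched in plain Java. This is only a minimal sketch: the class name `EscapedFragment` and the method `fromQuery` are made up for illustration, and it assumes the raw query string is handed in as a `String` (in a servlet it would typically come from `HttpServletRequest.getQueryString()`).

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper, not part of GWT or the servlet API.
final class EscapedFragment {
    static final String PARAM = "_escaped_fragment_";

    /**
     * Returns the URL-decoded value of _escaped_fragment_ if the raw query
     * string contains that parameter, or an empty Optional otherwise.
     */
    static Optional<String> fromQuery(String rawQuery) {
        if (rawQuery == null) {
            return Optional.empty();
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            if (name.equals(PARAM)) {
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                return Optional.of(URLDecoder.decode(value, StandardCharsets.UTF_8));
            }
        }
        return Optional.empty();
    }
}
```

If the result is present, the servlet would serve the crawler-friendly snapshot for that fragment; if it is empty, it would serve the regular Showcase.html.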

Or you could put Apache or a similar server in front of your application server, though I understand if you'd rather not; I wouldn't like to have to do that either. Maybe your Java EE server (which one are you using, by the way?) provides some mechanism for URL filtering before the request gets passed on to the web container. I'd like to know that, too!

Chris Lercher answered Oct 22 '22 18:10

Found my answer! The Showcase sample supporting crawlable hyperlinks is in the following branch: http://code.google.com/p/google-web-toolkit/source/browse/branches/crawlability/samples/showcase/?r=7726

It defines a filter in web.xml that redirects URLs containing the _escaped_fragment_ token to the output of HtmlUnit.
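To render a snapshot, a filter like this first has to work out which #! URL the crawler is really asking about. The following is only a sketch of that mapping and is not taken from the crawlability branch: the class `CrawlUrlRewriter` and the method `toHashBangUrl` are hypothetical names, and loading the resulting URL in a headless browser is left out.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not code from the GWT crawlability branch.
final class CrawlUrlRewriter {
    private static final String PARAM = "_escaped_fragment_=";

    /**
     * Turns ...?_escaped_fragment_=X back into ...#!X, keeping any other
     * query parameters. URLs without the parameter are returned unchanged.
     */
    static String toHashBangUrl(String url) {
        int q = url.indexOf('?');
        if (q < 0) {
            return url;
        }
        String base = url.substring(0, q);
        StringBuilder kept = new StringBuilder();
        String fragment = null;
        for (String pair : url.substring(q + 1).split("&")) {
            if (pair.startsWith(PARAM)) {
                // The crawler sends the fragment URL-escaped; undo that here.
                fragment = URLDecoder.decode(
                        pair.substring(PARAM.length()), StandardCharsets.UTF_8);
            } else if (!pair.isEmpty()) {
                if (kept.length() > 0) {
                    kept.append('&');
                }
                kept.append(pair);
            }
        }
        if (fragment == null) {
            return url;
        }
        String result = base;
        if (kept.length() > 0) {
            result += "?" + kept;
        }
        return result + "#!" + fragment;
    }
}
```

For the Showcase URL from the question, this maps the `_escaped_fragment_=CwRadioButton` form back to `Showcase.html#!CwRadioButton`, which is the page a headless browser would then load and serialize.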

Philippe Beaudoin answered Oct 22 '22 18:10