 

Why does Google Webmaster Tools see the template for the dynamic version of my site instead of the static version?

I have added the spiderable package to my Meteor app, and the HTML version of the page is returned when making requests with ?_escaped_fragment_= in the URL, but I'm unable to get Google to crawl the site.

Details

When using Fetch as Google in Google Webmaster Tools and requesting the root page "http://example.com/", the page returned is the JavaScript version, something like:

HTTP/1.1 200 OK
content-type: text/html; charset=utf-8
date: Fri, 30 Nov 2012 05:39:36 GMT
connection: Keep-alive
transfer-encoding: chunked

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="/e83157bdc4ff057fa3a20b82af4c11b4ebe776e7.css">
    <script type="text/javascript">
      __meteor_runtime_config__ = {"ROOT_URL":"http://www.example.com","DEFAULT_DDP_ENDPOINT":"https://www-example-com-ddp.meteor.com/"};
    </script>
    <script type="text/javascript" src="/13cf3d21ce1c4a88407ca5f3c250f186ab1738f9.js"></script>
    <meta name="fragment" content="!">
    <title>example.com</title>
  </head>
<body>
</body>
</html>

If I instead request http://example.com/?_escaped_fragment_= the HTML version is returned:

HTTP/1.1 200 OK
content-type: text/html; charset=UTF-8
date: Wed, 05 Dec 2012 02:44:09 GMT
connection: Keep-alive
transfer-encoding: chunked

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="/e83157bdc4ff057fa3a20b82af4c11b4ebe776e7.css">
    <title>example.com</title>
    <meta name="viewport" content="initial-scale=1.0">
  </head>
  <body>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/one">One</a></li>
      <li><a href="/two">Two</a></li>
    </ul>
  </body>
</html>
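
For reproducing the two requests above, here is a minimal Python sketch of how a plain URL maps to its ?_escaped_fragment_= counterpart under Google's AJAX crawling scheme, as far as I understand it. The function name `escaped_fragment_url` is hypothetical, and the escaping is a conservative approximation rather than the scheme's exact character list.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escaped_fragment_url(url: str) -> str:
    """Return the URL a crawler would request in place of `url`.

    Hypothetical helper: covers hashbang URLs (#!state) and pages that
    opt in with <meta name="fragment" content="!"> and have no fragment.
    """
    scheme, netloc, path, query, fragment = urlsplit(url)
    if fragment.startswith("!"):
        # Hashbang URL: the text after "#!" becomes the parameter value.
        # Assumption: keeping "/" and "=" unescaped is close enough here.
        value = quote(fragment[1:], safe="/=")
    elif fragment:
        # A plain "#anchor" is not part of the scheme; leave it alone.
        return url
    else:
        # No fragment at all: the parameter is appended with an empty value.
        value = ""
    param = "_escaped_fragment_=" + value
    # Append with "&" when the URL already carries a query string.
    query = f"{query}&{param}" if query else param
    return urlunsplit((scheme, netloc, path or "/", query, ""))


if __name__ == "__main__":
    for u in ("http://example.com/", "http://example.com/one",
              "http://example.com/#!products/1"):
        print(u, "->", escaped_fragment_url(u))
```

The first two inputs correspond to the non-hashbang URLs used in this app; the third shows the hashbang form for comparison.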

Questions

  • How do you tell Google to add ?_escaped_fragment_= to the URL, so that it gets the HTML version?

  • Will Google still add ?_escaped_fragment_= to the URL if the URLs do not have hashbangs (#!)? i.e. /home, /products/1 instead of /#!home, /#!products/1?

  • How do you make Google follow the linked pages and append ?_escaped_fragment_= to them? All of the JS versions of the pages have <meta name="fragment" content="!"> in the head. I assumed that was all that was required.

It seems that the simplest solution would be to update the spiderable package to return the HTML version to Googlebot, instead of requiring ?_escaped_fragment_=, but if this is working for others, I'm curious what I'm doing wrong.

Additional Info

Meteor's spiderable package is a temporary solution to allow web search engines to index Meteor applications.

According to the source it does a few things:

  1. It adds the following tag to the head section of the JS version of the page:

    <head><meta name="fragment" content="!"></head>

  2. Using PhantomJS, it renders the JavaScript application and returns an HTML version when either of the following conditions is met:

    a. The requesting user agent is "facebookexternalhit"

    b. The requested url contains the string ?_escaped_fragment_=
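
The two conditions above can be sketched in Python as follows. This is an illustration of the rule as I read it, not the package's actual source; the function name `should_serve_static` is hypothetical, and matching the user agent case-insensitively is my own assumption.

```python
# Assumption: only the single agent named in condition (a) is listed.
STATIC_AGENTS = ("facebookexternalhit",)


def should_serve_static(url: str, user_agent: str) -> bool:
    """Return True when the pre-rendered HTML version should be served."""
    # Condition (b): the request carries the escaped-fragment parameter.
    if "_escaped_fragment_=" in url:
        return True
    # Condition (a): the user agent is a known crawler.
    # Assumption: compare in lowercase so capitalisation does not matter.
    agent = user_agent.lower()
    return any(name in agent for name in STATIC_AGENTS)


if __name__ == "__main__":
    googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(should_serve_static("/", googlebot))                      # plain request
    print(should_serve_static("/?_escaped_fragment_=", googlebot))  # escaped request
    print(should_serve_static("/", "facebookexternalhit/1.1"))      # Facebook crawler
```

Under this rule, a plain request for "/" from Googlebot falls through both checks, which would be consistent with Fetch as Google receiving the JavaScript version of the root page.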

Kyle Finley asked Dec 06 '12 23:12


1 Answer

I believe this to be a "Google Webmaster Tools" bug.

It seems that Google is indeed crawling the site: the pages are showing up in Google results. Yet Google Webmaster Tools still lists total indexed pages as 1. Bing still isn't crawling the page, however.

EDIT: In Google Webmaster Tools the pages are listed as:

Not selected: Pages that are not indexed because they are substantially similar to other pages, or that have been redirected to another URL.

EDIT2: In response to Jonatan's question:

Will Google still add ?_escaped_fragment_= to the URL if the URLs do not have hashbangs (#!)?

Yes. My application does not use hashbangs (#!) in the URLs, and Googlebot still appends ?_escaped_fragment_= when crawling. Here's an example of the logs:

INFO HIT /url/2/01 66.249.72.42
INFO HIT /url/2/01?_escaped_fragment_= 66.249.72.142
INFO HIT /url/2/01 108.162.222.82
INFO HIT /url/2/01?_escaped_fragment_= 108.162.222.82
INFO HIT /url/2/05 108.162.222.82
INFO HIT /url/2/05?_escaped_fragment_= 108.162.222.214

It appears that Googlebot requests each URL both with and without ?_escaped_fragment_=.
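
To check this on a larger log, here is a rough Python sketch that pairs up plain and escaped-fragment hits per path. It assumes every relevant line has the shape "INFO HIT <url> <ip>", as in the sample above; the function name `crawl_coverage` is hypothetical.

```python
from collections import defaultdict

SUFFIX = "?_escaped_fragment_="

# The sample log lines from this answer.
SAMPLE_LOG = """\
INFO HIT /url/2/01 66.249.72.42
INFO HIT /url/2/01?_escaped_fragment_= 66.249.72.142
INFO HIT /url/2/01 108.162.222.82
INFO HIT /url/2/01?_escaped_fragment_= 108.162.222.82
INFO HIT /url/2/05 108.162.222.82
INFO HIT /url/2/05?_escaped_fragment_= 108.162.222.214
"""


def crawl_coverage(log_lines):
    """Map each path to whether it was hit plain and/or with the suffix."""
    seen = defaultdict(lambda: {"plain": False, "escaped": False})
    for line in log_lines:
        parts = line.split()
        # Skip anything that is not an "INFO HIT <url> <ip>" line.
        if len(parts) != 4 or parts[:2] != ["INFO", "HIT"]:
            continue
        url = parts[2]
        if url.endswith(SUFFIX):
            seen[url[: -len(SUFFIX)]]["escaped"] = True
        else:
            seen[url]["plain"] = True
    return dict(seen)


if __name__ == "__main__":
    for path, flags in crawl_coverage(SAMPLE_LOG.splitlines()).items():
        print(path, flags)
```

A path that shows "plain" but never "escaped" would be one where the crawler did not request the HTML version.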

Kyle Finley answered Oct 07 '22 17:10