
Do Google's crawlers interpret JavaScript? What if I load a page through AJAX? [closed]

Tags:

web-crawler

When a user visits my page, I make an AJAX call to load data inside a div. That's just how my application works.

The problem is that when I view the source of the page, it does not contain the HTML loaded by that AJAX call. Of course, when I do wget URL it also doesn't show the AJAX-loaded HTML. Makes sense.

But what about Google? Will Google be able to crawl the content, as if it's a browser? How do I allow Google to crawl my page just like a user would see it?

TIMEX asked Jan 14 '10

4 Answers

Despite the other answers here, Googlebot apparently does interpret JavaScript, to an extent, according to Matt Cutts:

"For a while, we were scanning within JavaScript, and we were looking for links. Google has gotten smarter about JavaScript and can execute some JavaScript. I wouldn't say that we execute all JavaScript, so there are some conditions in which we don't execute JavaScript. Certainly there are some common, well-known JavaScript things like Google Analytics, which you wouldn't even want to execute because you wouldn't want to try to generate phantom visits from Googlebot into your Google Analytics".

(Why answer an already-answered question? Mostly because I came across it via a duplicate posted today and didn't see this information here.)

T.J. Crowder answered Sep 27 '22


Actually, Google does have a solution for crawling AJAX applications:

http://code.google.com/web/ajaxcrawling/docs/getting-started.html

philfreo answered Sep 27 '22


Updated: From the answer to this question about "Ajax generated content, crawling and black listing", I found this document about the way Google crawls AJAX requests, which is part of a collection of documents about Making AJAX Applications Crawlable.

In short, it means you need to use <a href="#!data">...</a> rather than <a href="#data">...</a>, and then serve a real server-side response at the URL path/to/path?_escaped_fragment_=data.
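The URL rewriting in that scheme can be sketched in a few lines. This is a hypothetical illustration (the function name and example URLs are mine, not part of Google's spec): given a crawler request containing _escaped_fragment_, the server reconstructs the hash-bang URL whose content it is expected to render.

```javascript
// Hypothetical sketch of the _escaped_fragment_ mapping.
// A crawler requesting /page?_escaped_fragment_=data is asking for the
// content a browser would see at /page#!data, so the server can use the
// reconstructed URL to decide what static HTML snapshot to return.
function escapedFragmentToHashBang(url) {
  const u = new URL(url, "http://example.com"); // base only matters for relative URLs
  const fragment = u.searchParams.get("_escaped_fragment_");
  if (fragment === null) return null; // ordinary request, not a crawler rewrite
  u.searchParams.delete("_escaped_fragment_");
  return u.pathname + u.search + "#!" + fragment;
}

console.log(escapedFragmentToHashBang("/page?_escaped_fragment_=data")); // "/page#!data"
```

Note that URLSearchParams.get already returns the percent-decoded value, so no extra decoding step is needed here.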

Also consider a <link/> tag to give crawlers a hint about the SEO-friendly version of the content. <link rel="canonical"/>, which this article explains a bit, is a good candidate.

Note: I took the answer from: https://stackoverflow.com/questions/10006825/search-engine-misunderstanting/10006925#comment12792862_10006925 because it seems I can't delete mine here.

jldupont answered Sep 27 '22


What I do in this situation is initially populate the page server-side with content based on the default parameters of whatever the AJAX call would fetch. Then I use the AJAX JavaScript only to apply updates to the page.
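That approach is essentially progressive enhancement: the server renders the default content into the page, and the script only fetches and swaps in updates. A minimal sketch, assuming a hypothetical endpoint that returns an HTML fragment (the endpoint, parameters, and element id are all illustrative):

```javascript
// The server has already rendered the default results into the page, so
// crawlers (and users without JavaScript) see real content. The script
// only builds URLs for incremental updates against the same endpoint.
function buildUpdateUrl(basePath, params) {
  const query = new URLSearchParams(params).toString();
  return query ? basePath + "?" + query : basePath;
}

// In the browser, the fetched fragment would replace the pre-rendered list:
//   fetch(buildUpdateUrl("/products", { page: "2" }))
//     .then(r => r.text())
//     .then(html => { document.querySelector("#list").innerHTML = html; });

console.log(buildUpdateUrl("/products", { page: "2" })); // "/products?page=2"
```

Because the default view exists as plain HTML before any script runs, there is nothing special the crawler needs to execute to index it.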

Craig answered Sep 27 '22