 

Google not crawling links in AngularJS application

I have an AngularJS application that is injected into 3rd party sites. It injects dynamic content into a div on the 3rd party page. Google is successfully indexing this dynamic content but does not appear to be crawling links within the dynamic content. The links would look something like this in the dynamic content:

<a href="http://www.example.com/support?title=Example Title&titleId=12345">Link Here</a>

I'm using query parameters for the links rather than an actual url structure like:

http://www.example.com/support/title/Example Title/titleId/12345

I have to use the query parameters as I don't want the 3rd party site to have to change their web server configuration to redirect unfound URLs.

When a link is clicked I use the $location service to update the URL in the browser, and my Angular application responds accordingly: mainly it shows just the relevant content based on the query params, and sets the page title and meta description.
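For illustration, a simplified sketch of that logic (this assumes AngularJS 1.x with html5Mode enabled so $location.search() reflects the real query string; the module/controller names and the way the title and description are set are placeholders, not my exact code):

angular.module('supportWidget', [])
  .controller('SupportCtrl', ['$scope', '$location', function ($scope, $location) {
    // Read the query parameters from the URL, e.g. ?title=Example Title&titleId=12345
    var params = $location.search();
    $scope.title = params.title;
    $scope.titleId = params.titleId;

    // Update the page title and meta description for the selected item
    document.title = params.title || 'Support';
    var meta = document.querySelector('meta[name="description"]');
    if (meta) {
      meta.setAttribute('content', 'Support article for ' + params.title);
    }
  }]);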

Many of the articles I have read use the route provider in AngularJS with templates, but I'm not sure why this would make a difference to the crawler.

I have read that Google should treat URLs with query parameters as separate pages, so I don't believe that should be the issue: https://webmasters.googleblog.com/2008/09/dynamic-urls-vs-static-urls.html

The only things I have not tried are 1. providing a sitemap with the URLs that have the query parameters (an example entry is sketched below) and 2. adding static links from other pages to the dynamic links to help Google discover those pages.
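For concreteness, a single entry in such a sitemap would look roughly like this (the space and ampersand in the query string have to be escaped in the XML; the URL itself is just the example from above):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/support?title=Example%20Title&amp;titleId=12345</loc>
  </url>
</urlset>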

Any help, ideas or insights would be greatly appreciated.

asked Oct 12 '16 by AquaLunger


1 Answer

This happens because Google's crawlers cannot get static HTML from your URLs, since your pages are rendered dynamically with JavaScript. You can achieve what you want as follows.

Since the #! (hashbang) URL scheme is deprecated, you can tell Google that your pages are rendered with JavaScript by adding the following tag to your page's <head>:

<meta name="fragment" content="!">

When Googlebot finds this tag, it will request the URL from your server again with an _escaped_fragment_ query parameter appended, like

http://www.example.com/support?title=Example Title&titleId=12345&_escaped_fragment_=

On your server you then strip the _escaped_fragment_ parameter to get back the original URL, which looks like this again

http://www.example.com/support?title=Example Title&titleId=12345
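One way to handle this on the server is a small middleware that detects the _escaped_fragment_ parameter and serves a previously saved snapshot. A rough sketch, assuming Node/Express and snapshots saved as files keyed by titleId (the snapshots/ directory and naming scheme are assumptions, not part of your setup):

var express = require('express');
var fs = require('fs');
var path = require('path');

var app = express();

app.use(function (req, res, next) {
  // Googlebot signals an AJAX-crawl request with the _escaped_fragment_ parameter
  if (req.query._escaped_fragment_ === undefined) {
    return next(); // normal visitor: serve the Angular app as usual
  }

  // Look up a previously saved snapshot for this title
  var snapshotFile = path.join(__dirname, 'snapshots', req.query.titleId + '.html');

  fs.readFile(snapshotFile, 'utf8', function (err, html) {
    if (err) return next(); // no snapshot yet: fall through to the normal app
    res.send(html);         // serve the static HTML snapshot to the crawler
  });
});

app.listen(3000); // remaining routes would serve the Angular application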

Then you need to serve static HTML to the crawler for that URL. You can do this with a headless browser that loads the URL, executes the JavaScript, and writes the resulting markup to a file; PhantomJS is a good option for creating such an HTML snapshot of your page. You can also save the snapshots on your server, so that when Google's bots visit you can serve a snapshot directly instead of re-rendering the page each time.
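As a rough illustration, a PhantomJS script to create such a snapshot might look like this (the fixed delay used to wait for Angular to finish rendering is an assumption; in practice you would wait for a more reliable signal):

// save-snapshot.js -- run with: phantomjs save-snapshot.js "<url>" "<output file>"
var page = require('webpage').create();
var fs = require('fs');
var system = require('system');

var url = system.args[1];
var outputFile = system.args[2];

page.open(url, function (status) {
  if (status !== 'success') {
    console.error('Failed to load ' + url);
    phantom.exit(1);
  }
  // Give Angular a moment to render the dynamic content before capturing the DOM
  setTimeout(function () {
    fs.write(outputFile, page.content, 'w');
    phantom.exit(0);
  }, 2000);
});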

answered Oct 11 '22 by Himanshu Mittal