
How to prevent Google from indexing <script type="application/json"> content

I have discovered through Google's webmaster tools that Google is crawling paths that look like links embedded in JSON inside a <script type="application/json"> tag. This JSON is later parsed and used on the client side.

The problem is that the JSON contains paths that are not valid links. Google treats them as links and tries to crawl them, which produces a steadily increasing number of 404s and unnecessary crawler traffic.

What can I do to prevent Google from attempting to crawl these paths? I can add some patterns to robots.txt, but I want to ensure that Google ignores the contents of the script tag entirely and does not parse it for paths that look like links.

asked Apr 10 '26 14:04


1 Answer

Try this markup:

<!--googleoff: all-->
<script type="application/json">
  // your json content here
</script>
<!--googleon: all-->

As written in this post.

Plus a few more articles:
Preparing for a Crawl
FAQ - How do I use the googleon/googleoff Tags?

PS:

For an even more reliable approach: when possible, do not embed the content in the page at all.
Instead, load it on the fly, for example with an AJAX request.
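
A minimal sketch of that idea, assuming the JSON can be served from its own URL. The function name `loadPageData` and the injectable `fetchImpl` parameter are illustrative and not part of the original answer; only the standard `fetch` API is relied on.

```javascript
// Hypothetical helper: request the JSON at runtime instead of inlining it
// in a <script type="application/json"> tag in the HTML source.
// `fetchImpl` defaults to the browser's fetch and can be swapped for a stub.
async function loadPageData(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url, {
    headers: { Accept: "application/json" },
  });
  if (!response.ok) {
    // Surface HTTP errors instead of trying to parse an error page as JSON.
    throw new Error(`Failed to load ${url}: HTTP ${response.status}`);
  }
  return response.json();
}
```

It could be called as `loadPageData("/page-data.json").then(initPage)`, where both the URL and `initPage` are placeholders. Because the data then lives at a single URL, that URL can also be covered by a robots.txt rule, as mentioned in the question.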

answered Apr 12 '26 11:04

Sergey Sklyar