 

Allow only Google CSE and disallow Google standard search in robots.txt

I have a site that uses a Google Custom Search Engine. I want Google CSE to crawl my site, but I want the site to stay out of the results of a regular Google search. I put this in my robots.txt file, hoping that the Google CSE bots would ignore it, since I had specified the pages I wanted Google CSE to crawl in the CSE settings:

User-agent: *
Disallow: /

I guess the Google CSE bots also have to obey robots.txt. So is there a way to keep my pages out of regular search engine results while still letting Google CSE index them? TIA!

Bender asked Jan 21 '13


People also ask

Should I disallow search in robots.txt?

Using the robots.txt file, you can prevent search engines from accessing certain parts of your website, prevent duplicate content, and give search engines helpful tips on how they can crawl your website more efficiently.
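
A minimal robots.txt sketch of that idea (the paths and domain below are hypothetical, not from the question):

# Keep all crawlers out of a private area and out of internal search result pages
User-agent: *
Disallow: /private/
Disallow: /internal-search/
Sitemap: https://www.example.com/sitemap.xml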

What is allow and disallow in robots txt?

In practice, robots.txt files indicate whether certain user agents (web-crawling software) can or cannot crawl parts of a website. These crawl instructions are specified by “disallowing” or “allowing” the behavior of certain (or all) user agents.
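
As a hedged sketch (the /tmp/ path is illustrative), the following blocks Google's image crawler everywhere while letting every other crawler in, except for one directory:

# Block only Google's image crawler from the whole site
User-agent: Googlebot-Image
Disallow: /

# All other crawlers: everything is allowed except /tmp/
User-agent: *
Disallow: /tmp/
Allow: /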

How do I restrict Googlebot?

To prevent specific articles on your site from appearing in Google News and Google Search, block Googlebot's access to them using the following meta tag: <meta name="googlebot" content="noindex, nofollow">.
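
For placement, a minimal hypothetical page: the tag belongs in the <head> of each article you want excluded.

<!DOCTYPE html>
<html>
  <head>
    <!-- keep this article out of Google's index and don't follow its links -->
    <meta name="googlebot" content="noindex, nofollow">
    <title>Example article</title>
  </head>
  <body>...</body>
</html>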


1 Answer

There is no solution that does what you would like. I am in the same situation, needing custom search only. Unfortunately, Google's list of crawlers doesn't show a specific bot for Google Custom Search, so blocking Googlebot will kill both native search and custom search.
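
For illustration, blocking Googlebot outright would look like the sketch below; but because Google CSE serves results from the same index that Googlebot builds, this removes your pages from both regular search and custom search:

# Blocks Googlebot entirely, which kills CSE results too
User-agent: Googlebot
Disallow: /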

Giles Wells answered Oct 02 '22