According to the Google Custom Search API docs (http://code.google.com/apis/customsearch/docs/start.html#sites), a custom search engine can cover at most 5,000 sites. This is pretty lame. Is there any way around this limit so that I can search the entire web using Google's results?
Also, if you include a bunch of URL patterns that match more than 5,000 websites, how does the API pick which sites to include and which to exclude?
This is for a custom search, not a normal Google search. For example, if you owned abc.com and acme.com, you could set up a custom search on those two domains for your customers. That way, they could search your sites for information. The 5,000-site limit is actually huge. I'm not sure I can think of an application that would use that many specified sites.
I think what you are looking for is the Google Web Search API, which searches the entire web rather than a list of sites you specify. Unfortunately, that API is now deprecated (reference: http://code.google.com/apis/websearch/). You can still use the old API, but it is a risk because Google reserves the right to turn it off at any time. They will also limit the number of searches you perform per day (although I can't find a specific number for that limit). Here is a link to their terms: http://code.google.com/apis/websearch/terms.html
I would recommend looking at an API from another search engine if you really want to integrate it directly into your code. A different suggestion would be to put your search information behind an interface and code it to Google for now. Then if they turn it off or something better comes out, you could change just the search code to point to the newest and best API.
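To make the "put it behind an interface" suggestion concrete, here is a minimal sketch in Python. Every name in it (`SearchResult`, `SearchBackend`, `InMemoryBackend`, `find_urls`) is made up for illustration, and the in-memory backend is only a stand-in so the example runs without network access. A real backend would wrap whichever search API you end up using.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass
class SearchResult:
    """One hit, reduced to the fields the application needs (hypothetical shape)."""
    title: str
    url: str


class SearchBackend(Protocol):
    """The only surface the rest of the application is allowed to depend on."""

    def search(self, query: str, limit: int = 10) -> List[SearchResult]:
        ...


class InMemoryBackend:
    """Stand-in backend for illustration; it matches the query against titles.

    A Google-backed (or other engine's) class would expose the same
    search() method and translate the provider's response into SearchResult.
    """

    def __init__(self, pages: Iterable[SearchResult]) -> None:
        self._pages = list(pages)

    def search(self, query: str, limit: int = 10) -> List[SearchResult]:
        needle = query.lower()
        hits = [p for p in self._pages if needle in p.title.lower()]
        return hits[:limit]


def find_urls(backend: SearchBackend, query: str, limit: int = 10) -> List[str]:
    """Application code: knows about SearchBackend, not about any provider."""
    return [result.url for result in backend.search(query, limit)]


if __name__ == "__main__":
    backend = InMemoryBackend([
        SearchResult("Acme widgets", "http://acme.com/widgets"),
        SearchResult("ABC news", "http://abc.com/news"),
    ])
    print(find_urls(backend, "acme"))
```

If the provider is switched off or replaced, only the backend class changes; `find_urls` and everything built on top of it stay as they are.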
Google Custom Search is actually capable of searching the entire web, although the setting is not obvious. See "Search the entire web".
The other problems you are likely to run into are:
Sadly, "upgrading" to Google Site Search eliminates problem #2 at the expense of being able to search the entire web.