
How can you search Google Programmatically Java API [closed]

Does anyone know if and how it is possible to search Google programmatically - especially if there is a Java API for it?

asked Sep 16 '10 14:09 by Dan




2 Answers

Some facts:

  1. Google offers a public search webservice API which returns JSON: http://ajax.googleapis.com/ajax/services/search/web. Documentation here

  2. Java offers java.net.URL and java.net.URLConnection to fire and handle HTTP requests.

  3. In Java, JSON can be converted to a full-fledged JavaBean object using any Java JSON API. One of the best is Google Gson.

Now do the math:

    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.net.URL;
    import java.net.URLEncoder;
    import com.google.gson.Gson;

    public static void main(String[] args) throws Exception {
        String google = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
        String search = "stackoverflow";
        String charset = "UTF-8";

        // URL-encode the query, fetch the JSON response and map it onto the bean.
        URL url = new URL(google + URLEncoder.encode(search, charset));
        Reader reader = new InputStreamReader(url.openStream(), charset);
        GoogleResults results = new Gson().fromJson(reader, GoogleResults.class);

        // Show title and URL of 1st result.
        System.out.println(results.getResponseData().getResults().get(0).getTitle());
        System.out.println(results.getResponseData().getResults().get(0).getUrl());
    }
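The URL-encoding step in the snippet above matters as soon as the query contains spaces or reserved characters. Here is a minimal, standard-library-only sketch of what `URLEncoder.encode` produces. The class name `QueryEncodingDemo`, the helper `buildSearchUrl` and the sample query are made up for illustration; the base URL is the one from the answer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class QueryEncodingDemo {

    // Base URL taken from the answer; used here only to build a string, no request is sent.
    static final String BASE = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";

    // Hypothetical helper: appends the form-encoded query to the base URL.
    static String buildSearchUrl(String query) {
        return BASE + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Spaces become '+' and '&' becomes %26, so the query cannot be
        // mistaken for an additional request parameter.
        System.out.println(buildSearchUrl("java api & search"));
    }
}
```

The `Charset` overload of `URLEncoder.encode` needs Java 10 or newer; on older versions, the `String` charset overload used in the answer does the same job.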

With this Javabean class representing the most important JSON data as returned by Google (it actually returns more data, but it's left up to you as an exercise to expand this Javabean code accordingly):

    import java.util.List;

    public class GoogleResults {

        private ResponseData responseData;
        public ResponseData getResponseData() { return responseData; }
        public void setResponseData(ResponseData responseData) { this.responseData = responseData; }
        public String toString() { return "ResponseData[" + responseData + "]"; }

        // Maps the "responseData" object of the JSON response.
        static class ResponseData {
            private List<Result> results;
            public List<Result> getResults() { return results; }
            public void setResults(List<Result> results) { this.results = results; }
            public String toString() { return "Results[" + results + "]"; }
        }

        // Maps a single entry of the "results" array.
        static class Result {
            private String url;
            private String title;
            public String getUrl() { return url; }
            public String getTitle() { return title; }
            public void setUrl(String url) { this.url = url; }
            public void setTitle(String title) { this.title = title; }
            public String toString() { return "Result[url:" + url + ",title:" + title + "]"; }
        }

    }

### See also:

  • How to fire and handle HTTP requests using java.net.URLConnection
  • How to convert JSON to Java

Update: since November 2010 (2 months after the above answer), the public search web service has been deprecated, and the last day on which the service was offered was September 29, 2014. Your best bet now is to query http://www.google.com/search directly with an honest user agent and then parse the result using an HTML parser. If you omit the user agent, you get a 403 back. If you lie in the user agent and simulate a web browser (e.g. Chrome or Firefox), you get a much larger HTML response back, which wastes bandwidth and performance.

Here's a kickoff example using Jsoup as HTML parser:

    import java.net.URLDecoder;
    import java.net.URLEncoder;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String google = "http://www.google.com/search?q=";
    String search = "stackoverflow";
    String charset = "UTF-8";
    String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!

    Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select(".g>.r>a");

    for (Element link : links) {
        String title = link.text();
        String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
        url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");

        if (!url.startsWith("http")) {
            continue; // Ads/news/etc.
        }

        System.out.println("Title: " + title);
        System.out.println("URL: " + url);
    }
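The `substring` call above assumes that every link contains both `=` and `&`; if `&` is missing, `indexOf` returns -1 and `substring` throws. A slightly more defensive sketch of the same unwrapping step, using only the standard library, could look like this. The class and method names are hypothetical, and the `/url?q=...` format is the one described in the code comment above.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class RedirectLinkDemo {

    // Hypothetical helper: returns the decoded value of the "q" parameter, if present.
    static Optional<String> extractTarget(String redirectUrl) {
        int queryStart = redirectUrl.indexOf('?');
        if (queryStart < 0) {
            return Optional.empty(); // no query string at all
        }
        for (String pair : redirectUrl.substring(queryStart + 1).split("&")) {
            if (pair.startsWith("q=")) {
                return Optional.of(URLDecoder.decode(pair.substring(2), StandardCharsets.UTF_8));
            }
        }
        return Optional.empty(); // query string without a "q" parameter
    }

    public static void main(String[] args) {
        String link = "http://www.google.com/url?q=https%3A%2F%2Fstackoverflow.com%2F&sa=U&ei=someKey";
        System.out.println(extractTarget(link).orElse("(no target)"));
    }
}
```

Returning an `Optional` lets the caller skip links without a target instead of catching an exception inside the loop.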

answered Sep 24 '22 08:09 by BalusC


To search Google through an API, you should use Google Custom Search; scraping the web page is not allowed.

In Java, you can use the CustomSearch API Client Library for Java.

The Maven dependency is:

    <dependency>
        <groupId>com.google.apis</groupId>
        <artifactId>google-api-services-customsearch</artifactId>
        <version>v1-rev57-1.23.0</version>
    </dependency>

Example code that searches using the Google CustomSearch API Client Library:

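    import java.io.IOException;
    import java.security.GeneralSecurityException;
    import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
    import com.google.api.client.json.jackson2.JacksonFactory;
    import com.google.api.services.customsearch.Customsearch;
    import com.google.api.services.customsearch.CustomsearchRequestInitializer;
    import com.google.api.services.customsearch.model.Result;
    import com.google.api.services.customsearch.model.Search;

    public static void main(String[] args) throws GeneralSecurityException, IOException {

        String searchQuery = "test"; // The query to search
        String cx = "002845322276752338984:vxqzfa86nqc"; // Your search engine

        // Create the Customsearch instance
        Customsearch cs = new Customsearch.Builder(GoogleNetHttpTransport.newTrustedTransport(), JacksonFactory.getDefaultInstance(), null)
                .setApplicationName("MyApplication")
                .setGoogleClientRequestInitializer(new CustomsearchRequestInitializer("your api key"))
                .build();

        // Set search parameters
        Customsearch.Cse.List list = cs.cse().list(searchQuery).setCx(cx);

        // Execute search
        Search result = list.execute();
        if (result.getItems() != null) {
            for (Result ri : result.getItems()) {
                // Get title, link, body etc. from search
                System.out.println(ri.getTitle() + ", " + ri.getLink());
            }
        }

    }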

As you can see, you will need to request an API key and set up your own search engine ID (cx).

Note that you can search the whole web by selecting "Search entire web" on the Basics tab while setting up the cx, but the results will not be exactly the same as those of a normal Google search in a browser.
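If you prefer plain HTTP over the client library, the same search can also be expressed as a REST request. The sketch below only builds the request URL and does not send it. It assumes the Custom Search JSON API endpoint `https://www.googleapis.com/customsearch/v1` with the `key`, `cx` and `q` query parameters, so check the current Google documentation before relying on it. The class name, method names and placeholder values are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class CustomSearchUrlDemo {

    // Assumed endpoint of the Custom Search JSON API.
    static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";

    // Hypothetical helper: combines API key, search engine ID (cx) and query into one URI.
    static URI buildRequestUri(String apiKey, String cx, String query) {
        return URI.create(ENDPOINT
                + "?key=" + encode(apiKey)
                + "&cx=" + encode(cx)
                + "&q=" + encode(query));
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with your own API key and cx.
        System.out.println(buildRequestUri("your-api-key", "your-engine-id", "test"));
    }
}
```

The resulting URI can be passed to any HTTP client, and the JSON response can then be mapped with a JSON library of your choice.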

Currently (as of the date of this answer) you get 100 API calls per day for free; beyond that, Google wants a share of your profit.
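Because the free tier is limited, it can help to count calls on the client side and stop before the quota is used up. Here is a minimal sketch, assuming a budget of 100 calls per calendar day as stated above. The class `DailyQuota` and its methods are hypothetical and not part of any Google library, and the real quota is enforced by Google on its own schedule.

```java
import java.time.LocalDate;

final class DailyQuota {

    private final int limit;
    private LocalDate day;
    private int used;

    DailyQuota(int limit, LocalDate today) {
        this.limit = limit;
        this.day = today;
    }

    // Returns true and counts the call if budget remains for the given day.
    synchronized boolean tryAcquire(LocalDate today) {
        if (!today.equals(day)) {
            day = today; // a different day: reset the counter
            used = 0;
        }
        if (used >= limit) {
            return false;
        }
        used++;
        return true;
    }

    public static void main(String[] args) {
        DailyQuota quota = new DailyQuota(100, LocalDate.now());
        if (quota.tryAcquire(LocalDate.now())) {
            System.out.println("OK to call the search API");
        } else {
            System.out.println("Daily free quota used up");
        }
    }
}
```

Passing the date in as a parameter keeps the counter easy to test; in an application you would call `tryAcquire(LocalDate.now())` right before `list.execute()`.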

answered Sep 24 '22 08:09 by Petter Friberg