Can you use Jsoup to submit a search to Google, but instead of sending your request via "Google Search" use "I'm Feeling Lucky"? I would like to capture the name of the site that would be returned.
I see lots of examples of submitting forms, but never a way to specify a specific button to perform the search or form submission.
If Jsoup won't work, what would?
According to the HTML source of http://google.com, the "I'm Feeling Lucky" button has a name of btnI:
<input value="I'm Feeling Lucky" name="btnI" type="submit" onclick="..." />
So, just adding the btnI parameter to the query string should do (the value doesn't matter):
http://www.google.com/search?hl=en&btnI=1&q=your+search+term
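If the search term comes from user input, it needs to be URL-encoded before it goes into the query string. A minimal sketch of building that URL follows. The class name LuckyUrl and the method build are made up for illustration, and the Charset overload of URLEncoder.encode assumes Java 10 or newer:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Jsoup): builds the "I'm Feeling Lucky" URL.
final class LuckyUrl {
    private static final String BASE = "http://www.google.com/search?hl=en&btnI=1&q=";

    // URL-encodes the search term (spaces become '+') and appends it as the q parameter.
    static String build(String searchTerm) {
        return BASE + URLEncoder.encode(searchTerm, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(build("your search term"));
    }
}
```

The resulting string can be passed to Jsoup.connect(...) in place of a hard-coded URL.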
So, this Jsoup code should do:
String url = "http://www.google.com/search?hl=en&btnI=1&q=balusc";
Document document = Jsoup.connect(url).get();
System.out.println(document.title());
However, this gave a 403 (Forbidden) error.
Exception in thread "main" java.io.IOException: 403 error loading URL http://www.google.com/search?hl=en&btnI=1&q=balusc
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:387)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:364)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:143)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:132)
    at test.Test.main(Test.java:17)
Perhaps Google was sniffing the user agent and discovering it to be Java. So, I changed it:
String url = "http://www.google.com/search?hl=en&btnI=1&q=balusc";
Document document = Jsoup.connect(url).userAgent("Mozilla").get();
System.out.println(document.title());
This yields (as expected):
The BalusC Code
The 403 is, however, an indication that Google isn't necessarily happy with bots like that. You might get (temporarily) IP-banned if you do this too often.
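One way to avoid doing this "too often" is to space the requests out. Below is a rough sketch of a throttle that enforces a minimum gap between requests. The class RequestThrottle and its methods are hypothetical, and the interval you pick is your own guess: nothing here is known to be a limit Google accepts, and a delay does not guarantee you won't be blocked.

```java
// Hypothetical helper: enforces a minimum interval between outgoing requests.
final class RequestThrottle {
    private final long minIntervalMillis;
    private long lastRequestMillis;
    private boolean hasRequested = false;

    RequestThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Returns how many milliseconds a caller at time nowMillis still has to wait.
    long millisToWait(long nowMillis) {
        if (!hasRequested) {
            return 0; // the first request never waits
        }
        long elapsed = nowMillis - lastRequestMillis;
        return Math.max(0, minIntervalMillis - elapsed);
    }

    // Records that a request was sent at nowMillis.
    void markRequest(long nowMillis) {
        lastRequestMillis = nowMillis;
        hasRequested = true;
    }

    // Sleeps until the minimum interval has passed, then records the request.
    void acquire() throws InterruptedException {
        long wait = millisToWait(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
        markRequest(System.currentTimeMillis());
    }
}
```

Calling acquire() on a shared instance right before each Jsoup.connect(url).userAgent("Mozilla").get() would keep consecutive requests at least the chosen interval apart.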
I'd try HtmlUnit for navigating through a site, and Jsoup for scraping.