
Can Jsoup simulate a button press?

Tags:

java

jsoup

Can you use Jsoup to submit a search to Google, but instead of sending your request via "Google Search" use "I'm Feeling Lucky"? I would like to capture the name of the site that would be returned.

I see lots of examples of submitting forms, but never a way to specify a specific button to perform the search or form submission.

If Jsoup won't work, what would?

Brian asked Sep 22 '11 02:09


2 Answers

According to the HTML source of http://google.com, the "I'm Feeling Lucky" button has the name btnI:

<input value="I'm Feeling Lucky" name="btnI" type="submit" onclick="..." />

So just adding the btnI parameter to the query string should be enough (the value doesn't matter):

http://www.google.com/search?hl=en&btnI=1&q=your+search+term
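If the search term comes from user input, it needs to be URL-encoded before it goes into the query string. Here is a minimal sketch using only the standard library (Java 10+). The class and method names are made up for illustration, and the btnI value of 1 is arbitrary, as noted above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class LuckyUrl {
    // Builds the "I'm Feeling Lucky" URL for a search term.
    // Only the presence of btnI matters; its value (1 here) is arbitrary.
    static String build(String searchTerm) {
        // URLEncoder turns spaces into '+' and escapes characters such as '&' and '+'.
        String q = URLEncoder.encode(searchTerm, StandardCharsets.UTF_8);
        return "http://www.google.com/search?hl=en&btnI=1&q=" + q;
    }

    public static void main(String[] args) {
        // prints http://www.google.com/search?hl=en&btnI=1&q=your+search+term
        System.out.println(build("your search term"));
    }
}
```

The resulting string can be passed straight to Jsoup.connect(...).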

So this Jsoup code should do:

String url = "http://www.google.com/search?hl=en&btnI=1&q=balusc";
Document document = Jsoup.connect(url).get();
System.out.println(document.title());

However, this gave a 403 (Forbidden) error.

Exception in thread "main" java.io.IOException: 403 error loading URL http://www.google.com/search?hl=en&btnI=1&q=balusc
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:387)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:364)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:143)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:132)
    at test.Test.main(Test.java:17)

Perhaps Google was sniffing the user agent and detecting that it was Java. So, I changed it:

String url = "http://www.google.com/search?hl=en&btnI=1&q=balusc";
Document document = Jsoup.connect(url).userAgent("Mozilla").get();
System.out.println(document.title());
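As for "if Jsoup won't work, what would?": the same request can be made without Jsoup by using the JDK's own java.net.http client (Java 11+), which is a different technique from the Jsoup call above. The sketch below sets the same User-Agent and, instead of following the redirect, reads the Location header. It assumes Google answers the btnI request with a redirect, which I have not verified and which may change. The class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

class LuckyRedirect {
    // Builds a GET request that sends the given User-Agent header.
    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .GET()
                .build();
    }

    // Sends the request without following redirects and returns the
    // Location header, if the server sent one. Requires network access.
    static Optional<String> redirectTarget(HttpRequest request) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        return response.headers().firstValue("Location");
    }

    public static void main(String[] args) {
        // Only builds the request here; call redirectTarget(request) to actually send it.
        HttpRequest request = buildRequest(
                "http://www.google.com/search?hl=en&btnI=1&q=balusc", "Mozilla");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

An empty Optional from redirectTarget means there was no redirect, in which case the response would have to be inspected some other way.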

This yields (as expected):

The BalusC Code
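If the host name is wanted rather than the page title, one option is to take the URL the document was finally loaded from (Jsoup exposes it as document.location()) and parse it with java.net.URI. Here is a minimal sketch. The class and method names and the example URL are made up for illustration.

```java
import java.net.URI;

class SiteName {
    // Returns the host part of a URL, or null if the URL has no host.
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    public static void main(String[] args) {
        // prints www.example.com
        System.out.println(hostOf("http://www.example.com/some/page.html"));
    }
}
```

With Jsoup this would be called as SiteName.hostOf(document.location()).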

The 403 is, however, an indication that Google isn't happy with bots like this. You might get (temporarily) IP-banned if you do this too often.

BalusC answered Nov 06 '22 01:11


I'd try HtmlUnit for navigating through a site, and Jsoup for scraping.

pranahata answered Nov 05 '22 23:11