
Getting Data from Internet in Java

Tags:

java

html

web

jnlp

I am planning the following application for my college project in Java. I know core Java. Since time is short, I want to know what specifically I should read for this project:

It will have an interface for entering a query. The query string is sent to internet search engines, and the application uses the search engine to find the data (for now, the data is simply the first web page that we see :) ).
I do not want to display the data. I just want the HTML file, i.e. the source code of the generated web page. Does this sound like the Common Gateway Interface? I do not know much about it.

But I think it serves the same purpose. If it does, please guide me on how to implement this.
Whatever the right approach is, please specify it.

  • Problem 1: What should I read? I am not looking for a direct solution at this point; I want to implement it myself.
  • Problem 2: Does connecting to the internet require some JNLP knowledge too?

For example, when we search for something on Google, it shows us links to websites. I can see the source code of this generated web page. I just want this page for my application to work on.

EDIT: I do not want to rely only on Google or on any particular web server. I want my application to decide that.
Please also address my Problem 2.

I have discovered that websites have terms and conditions. Should I try to write my own crawler instead? Would my application then avoid breaking the rules? This is important to me.

Ashish Negi asked Jan 21 '26 01:01



2 Answers

Ashish, here is what I would recommend.

  1. Learn the basics of JSON from these links (Introduction, lib download).
  2. Then look at the Google Web Search JSON API here.
  3. Learn how to GET data from servers using the HttpClient library here.
  4. Now fire a GET request for the search, read the JSON response, and parse it using the JSON lib from #1. That gives you the search results.
  5. Most search engines (Bing etc.) offer JSON/REST APIs, so you can do the same for other search engines.

Note: JSON APIs are normally used from JavaScript on the UI side, but since they are easy and quick to learn, I suggested them. If time permits, you can also explore the XML-based APIs.
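Steps 3 and 4 can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: it uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) rather than the HttpClient library linked above, and the endpoint URL, the `q` parameter name, and the class and method names are hypothetical placeholders. The real values come from the documentation of whichever search API you pick.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class SearchClient {
    // Hypothetical endpoint: replace with the URL documented by the search API you choose.
    static final String ENDPOINT = "https://api.example.com/search";

    // Builds the request URI; the query is URL-encoded so spaces and symbols are safe.
    // The parameter name "q" is an assumption, not taken from any specific API.
    static URI buildSearchUri(String endpoint, String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?q=" + encoded);
    }

    // Fires a GET request and returns the raw JSON text of the response body.
    static String fetchJson(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        URI uri = buildSearchUri(ENDPOINT, "getting data from internet in java");
        System.out.println(uri);
        // Once ENDPOINT points at a real API, call fetchJson(uri) and hand the
        // returned text to the JSON library from step 1 for parsing.
    }
}
```

Keeping the URI building separate from the network call means the endpoint can be swapped per search engine, which fits the goal of not depending on one provider.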

Santosh answered Jan 23 '26 15:01



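import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

// Place inside a method that declares "throws IOException".
URL url = new URL("http://fooooo.com");
// try-with-resources closes the reader (and the underlying stream) automatically
try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
    String inputLine;
    while ((inputLine = in.readLine()) != null) {
        System.out.println(inputLine);
    }
}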

This should be enough to get you started.
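Since the question asks for the HTML source itself rather than console output, the same read loop can collect the lines into a String instead of printing them. The following is a rough sketch, not a definitive implementation: the class and method names are made up for illustration, and it assumes the page is encoded as UTF-8, which a real application should check against the response's Content-Type header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class PageSource {
    // Reads every line from the given reader and returns the text,
    // with each line terminated by '\n'. The reader is closed afterwards.
    static String readAll(Reader source) throws IOException {
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    // Downloads the page at the given address and returns its HTML source.
    // Assumption: the page is UTF-8 encoded.
    static String fetchHtml(String address) throws IOException {
        URL url = new URL(address);
        return readAll(new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Offline demonstration of the read loop; for a live page,
        // call fetchHtml("http://...") instead.
        System.out.print(readAll(new StringReader("<html>\n<body>hi</body>\n</html>")));
    }
}
```

Taking a `Reader` in `readAll` keeps the loop independent of the network, so it can be tried with a `StringReader` before pointing it at a live site.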

And yes, do check that you are not violating the usage terms of a website. Search engines don't really like it when you try to access them via a program.

Many, including Google, have APIs specifically designed for this purpose.

amal answered Jan 23 '26 15:01



