
Using Java to pull data from a webpage?

Tags: java

I'm attempting to make my first program in Java. The goal is to write a program that browses to a website and downloads a file for me. However, I don't know how to use Java to interact with the internet. Can anyone tell me what topics to look up/read about or recommend some good resources?

user658168 asked May 28 '11 01:05




1 Answer

The simplest solution (without depending on any third-party library or platform) is to create a URL instance pointing to the web page / link you want to download, and read the content using streams.

For example:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;

    public class DownloadPage {

        public static void main(String[] args) throws IOException {

            // Make a URL to the web page
            URL url = new URL("http://stackoverflow.com/questions/6159118/using-java-to-pull-data-from-a-webpage");

            // Get the input stream through URL Connection
            URLConnection con = url.openConnection();
            InputStream is = con.getInputStream();

            // Once you have the Input Stream, it's just plain old Java IO stuff.

            // For this case, since you are interested in getting plain-text web page
            // I'll use a reader and output the text content to System.out.

            // For binary content, it's better to directly read the bytes from stream and write
            // to the target file.

            try (BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
                String line = null;

                // read each line and write to System.out
                while ((line = br.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }
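Since the question is about downloading a file, here is a rough sketch of the binary case mentioned in the comments above: it copies the raw bytes from the stream straight into a file rather than reading text line by line. The class name `DownloadFile`, the `download` helper, and the command-line arguments are names made up for this illustration, and it assumes Java 7 or later for `java.nio.file.Files`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical example class, not part of the original answer.
public class DownloadFile {

    // Copies the raw bytes behind the given URL into the target file,
    // overwriting it if it already exists, and returns the byte count.
    public static long download(URL url, Path target) throws IOException {
        URLConnection con = url.openConnection();
        try (InputStream is = con.getInputStream()) {
            return Files.copy(is, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Expects two arguments: the address to fetch and the file to write.
        if (args.length != 2) {
            System.err.println("Usage: java DownloadFile <url> <target-file>");
            return;
        }
        URL url = new URL(args[0]);
        Path target = Paths.get(args[1]);
        long bytes = download(url, target);
        System.out.println("Saved " + bytes + " bytes to " + target.toAbsolutePath());
    }
}
```

No reader or character decoding is involved here, so images, archives, and other non-text files should come through unchanged. This is only a starting point: it does not handle redirects, timeouts, or HTTP error codes.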

Hope this helps.

Yohan Liyanage answered Oct 15 '22 13:10
