
How do I retrieve a URL from a web site using Java?

I want to use HTTP GET and POST commands to retrieve URLs from a website and parse the HTML. How do I do this?

Johnny Maelstrom asked Dec 11 '08 14:12



2 Answers

You can use HttpURLConnection in combination with URL.

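import java.io.*;
import java.net.*;
import java.nio.charset.StandardCharsets;

URL url = new URL("http://example.com");
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET"); // GET is the default; set here for clarity
connection.connect();

// Read the body line by line; UTF-8 is an assumption, check the response's Content-Type.
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
    reader.lines().forEach(System.out::println);
}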
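
The question also asks about POST. A minimal sketch of a form POST with the same HttpURLConnection class might look like the following. The class name FormPost, its helper methods, and the choice of UTF-8 for both request and response are assumptions made for illustration, not something from the original answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper class; the names here are illustrative only.
class FormPost {

    // Encodes key/value pairs as application/x-www-form-urlencoded.
    static String encodeForm(Map<String, String> fields) throws IOException {
        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            joined.add(URLEncoder.encode(field.getKey(), "UTF-8")
                    + "=" + URLEncoder.encode(field.getValue(), "UTF-8"));
        }
        return joined.toString();
    }

    // Reads a stream to the end and decodes it as UTF-8 (an assumption;
    // a real client should honour the charset in the Content-Type header).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Sends the fields as a form POST and returns the response body.
    static String post(String address, Map<String, String> fields) throws IOException {
        byte[] body = encodeForm(fields).getBytes(StandardCharsets.UTF_8);
        HttpURLConnection connection =
                (HttpURLConnection) new URL(address).openConnection();
        try {
            connection.setRequestMethod("POST");
            connection.setDoOutput(true); // required before writing a request body
            connection.setRequestProperty("Content-Type",
                    "application/x-www-form-urlencoded; charset=UTF-8");
            connection.setFixedLengthStreamingMode(body.length);
            try (OutputStream out = connection.getOutputStream()) {
                out.write(body);
            }
            // getInputStream() throws an IOException for 4xx/5xx responses.
            try (InputStream in = connection.getInputStream()) {
                return readAll(in);
            }
        } finally {
            connection.disconnect();
        }
    }
}
```

A call would then be something like FormPost.post("http://example.com/search", fields), where the URL and the field names depend entirely on the site you are talking to.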
Rob Hruska answered Sep 30 '22 00:09


The easiest way to do a GET is to use the built-in java.net.URL. However, as mentioned, HttpClient is the better choice, since it lets you handle redirects, among other things.
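
To illustrate the redirect point, here is a rough sketch using the java.net.http.HttpClient that ships with JDK 11 and later. Note that this is swapped in for illustration and is not necessarily the HttpClient library this answer has in mind. The class name RedirectingGet and the timeout values are assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class; requires JDK 11 or later.
class RedirectingGet {

    // Builds a client that follows redirects, except from HTTPS to HTTP.
    // The JDK client does not follow redirects unless asked to.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .connectTimeout(Duration.ofSeconds(10)) // assumed value
                .build();
    }

    // Builds a plain GET request for the given address.
    static HttpRequest newGet(String address) {
        return HttpRequest.newBuilder(URI.create(address))
                .timeout(Duration.ofSeconds(30)) // assumed value
                .GET()
                .build();
    }

    // Fetches the page body as a string, failing on non-2xx responses.
    static String fetch(String address) throws IOException, InterruptedException {
        HttpResponse<String> response =
                newClient().send(newGet(address), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() / 100 != 2) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

With this in place, RedirectingGet.fetch("http://example.com") returns the body of whatever page the redirects end at.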

For parsing the HTML, you can use an HTML parser.
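
As a rough illustration of the parsing step, the sketch below pulls the href of every link out of a page using the HTML parser bundled with the JDK (javax.swing.text.html). This is a stand-in chosen because it needs no extra dependency; it is not the parser library this answer refers to, and it is lenient about modern HTML only up to a point. The class name LinkExtractor and the sample markup are assumptions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class built on the JDK's bundled HTML parser.
class LinkExtractor {

    // Collects the href attribute of every <a> tag, in document order.
    static List<String> extractLinks(Reader html) throws IOException {
        List<String> links = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attributes,
                                       int position) {
                if (tag == HTML.Tag.A) {
                    Object href = attributes.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };
        // The last argument tells the parser to ignore charset declarations
        // in the markup, since the Reader has already decoded the bytes.
        new ParserDelegator().parse(html, callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        // Sample markup, made up for this example.
        String page = "<html><body><a href=\"/a\">A</a> "
                + "<a href=\"http://example.com/b\">B</a></body></html>";
        System.out.println(extractLinks(new StringReader(page)));
    }
}
```

In practice the Reader would wrap the InputStream returned by the HTTP connection rather than a string literal.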

kgiannakakis answered Sep 30 '22 00:09