I want to use HTTP GET and POST requests to fetch pages from a website and parse the HTML. How do I do this in Java?
Constructors of the URL class:
URL(URL context, String spec): Creates a URL object by parsing the given spec in the given context.
URL(String protocol, String host, int port, String file, URLStreamHandler handler): Creates a URL object from the specified protocol, host, port number, file, and handler.
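For illustration, a quick sketch of both constructor forms; the host, port, and paths are placeholders, and both constructors throw the checked MalformedURLException:

URL base = new URL("http://example.com/docs/");
URL relative = new URL(base, "page.html"); // context + spec -> http://example.com/docs/page.html
// a null handler means the default stream handler for the protocol is used
URL explicit = new URL("http", "example.com", 80, "/index.html", null);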
You can use HttpURLConnection in combination with URL.
URL url = new URL("http://example.com");
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");
connection.connect();
// read the response body with an InputStreamReader
try (BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
}
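Since the question also asks about POST, here is a minimal sketch of a form-encoded POST with the same HttpURLConnection API. The URL and the q=java parameter are placeholders, and the snippet assumes the enclosing method handles IOException (imports: java.net.URL, java.net.HttpURLConnection, java.net.URLEncoder, java.io.OutputStream, java.nio.charset.StandardCharsets):

URL url = new URL("http://example.com/search");
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("POST");
connection.setDoOutput(true); // required so a request body can be written
connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
String body = "q=" + URLEncoder.encode("java", "UTF-8"); // placeholder form parameter
try (OutputStream out = connection.getOutputStream()) {
    out.write(body.getBytes(StandardCharsets.UTF_8));
}
int status = connection.getResponseCode(); // sends the request
// then read connection.getInputStream() exactly as in the GET example above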
The easiest way to do a GET is the built-in java.net.URL / HttpURLConnection shown above. However, a dedicated HTTP client such as Apache HttpClient (or the java.net.http.HttpClient that ships with Java 11+) is usually the better choice, since it handles things like redirects, connection pooling, and timeouts for you.
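If you are on Java 11 or newer, a minimal GET sketch with the standard java.net.http.HttpClient, configured to follow redirects, could look like this (example.com is a placeholder; send throws IOException and InterruptedException, and the classes live in java.net and java.net.http):

HttpClient client = HttpClient.newBuilder()
        .followRedirects(HttpClient.Redirect.NORMAL)
        .build();
HttpRequest request = HttpRequest.newBuilder(URI.create("http://example.com"))
        .GET()
        .build();
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
String html = response.body();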
For parsing the HTML, use an HTML parser library; jsoup is a popular choice.
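As one example, a minimal sketch with jsoup (an assumption, not the only parser) that fetches a page and extracts its title and absolute link URLs; imports are org.jsoup.Jsoup, org.jsoup.nodes.Document, and org.jsoup.nodes.Element, and connect(...).get() throws IOException:

Document doc = Jsoup.connect("http://example.com").get(); // performs the GET itself
System.out.println(doc.title());
for (Element link : doc.select("a[href]")) {
    System.out.println(link.attr("abs:href")); // resolves relative links to absolute URLs
}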