I'm trying to get an entire web page through a URLConnection.
What's the most efficient way to do this?
I'm doing this already:
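import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

URL url = new URL("http://www.google.com/");
URLConnection connection = url.openConnection();
InputStream in = connection.getInputStream();
BufferedReader bf = new BufferedReader(new InputStreamReader(in));
StringBuffer html = new StringBuffer();
String line = bf.readLine();
while (line != null) {
    // readLine() strips the line terminator, so add it back
    html.append(line).append('\n');
    line = bf.readLine();
}
bf.close();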
At this point, html holds the entire HTML page.
I think this is the best way. The size of the page is fixed ("it is what it is"), so you can't improve on memory. Perhaps you can compress the contents once you have them, but they aren't very useful in that form. I would imagine that eventually you'll want to parse the HTML into a DOM tree.
Anything you do to parallelize the reading would overly complicate the solution.
I'd recommend using a StringBuilder rather than a StringBuffer, since nothing here needs synchronization, and giving it an initial capacity of 2048 or 4096.
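To make that suggestion concrete, here is a minimal sketch of the same read loop using a pre-sized StringBuilder. The class name PageFetcher, the helper names readAll and fetch, and the choice of UTF-8 as the charset are my assumptions for illustration, not something from the question; a real page may declare a different encoding in its headers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named for illustration only.
class PageFetcher {

    // Reads everything from the given reader into one String.
    static String readAll(Reader source) throws IOException {
        // Pre-size the builder so the first few kilobytes need no reallocation.
        StringBuilder html = new StringBuilder(4096);
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() drops the line terminator, so put one back.
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    // Fetches a page over a URLConnection; UTF-8 is an assumed charset.
    static String fetch(String address) throws IOException {
        URLConnection connection = new URL(address).openConnection();
        return readAll(new InputStreamReader(
                connection.getInputStream(), StandardCharsets.UTF_8));
    }
}
```

Calling PageFetcher.fetch("http://www.google.com/") would then return the page as a String. Splitting out readAll is just a convenience so the loop can be exercised with any Reader, without touching the network.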
Why do you think the code you posted isn't sufficient? This sounds like premature optimization.
Run with what you have and sleep at night.