Without the use of any external library, what is the simplest way to fetch a website's HTML content into a String?
Put a tag like $tag for any dynamic content and then do something like this: File htmlTemplateFile = new File("path/template.html"); String htmlString = FileUtils.readFileToString(htmlTemplateFile); String title = "New Page"; String body = "This is Body"; htmlString = htmlString.
Your browser parses HTML and renders it for you. Sometimes, though, we need to parse an HTML document ourselves to find elements, tags, or attributes, or to check whether a particular element exists. In Java, we can extract the HTML content and parse the HTML document. Approaches: Using FileReader.
I'm currently using this:
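    import java.net.URL;
    import java.net.URLConnection;
    import java.util.Scanner;

    String content = null;
    try {
        URLConnection connection = new URL("http://www.google.com").openConnection();
        // try-with-resources closes the scanner (and the underlying stream) even on failure.
        // UTF-8 is assumed here; the page may declare a different charset.
        try (Scanner scanner = new Scanner(connection.getInputStream(), "UTF-8")) {
            // "\\Z" matches the end of input, so next() returns the body as a single token
            scanner.useDelimiter("\\Z");
            content = scanner.next();
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    System.out.println(content);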
But I'm not sure if there's a better way.
This has worked well for me:
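    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    URL url = new URL(theURL);
    StringBuilder buffer = new StringBuilder();
    // Decode through a Reader: casting each raw byte to char corrupts non-ASCII text.
    // UTF-8 is assumed here; the page may declare a different charset.
    try (Reader reader = new BufferedReader(
            new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
        int ptr;
        while ((ptr = reader.read()) != -1) {
            buffer.append((char) ptr);
        }
    }
    String content = buffer.toString();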
I'm not sure whether the other solution(s) provided are any more efficient.
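For comparison, here is a minimal sketch of a shorter variant that stays within the standard library. It assumes Java 9 or later (for `InputStream.readAllBytes`) and assumes the page is UTF-8 encoded rather than reading the charset from the response headers. The class name `HtmlFetcher` and the helpers `readAll` and `fetch` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HtmlFetcher {

    // Reads the whole stream into memory and decodes it with the given charset.
    // Requires Java 9+ for InputStream.readAllBytes().
    static String readAll(InputStream in, Charset charset) throws IOException {
        return new String(in.readAllBytes(), charset);
    }

    // Opens a connection to the address and returns the response body as a String.
    // Assumption: the body is UTF-8; a real page may declare another charset.
    static String fetch(String address) throws IOException {
        URLConnection connection = URI.create(address).toURL().openConnection();
        try (InputStream in = connection.getInputStream()) {
            return readAll(in, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch("http://www.google.com"));
    }
}
```

Splitting the decoding step into `readAll` keeps it usable with any `InputStream`, so it can be exercised without a network connection. Reading all bytes at once is convenient for ordinary pages but holds the entire response in memory.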