
How to fetch HTML in Java

Without the use of any external library, what is the simplest way to fetch a website's HTML content into a String?

pek asked Aug 28 '08 01:08




2 Answers

I'm currently using this:

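    import java.net.URL;
    import java.net.URLConnection;
    import java.util.Scanner;

    String content = null;
    URLConnection connection = null;
    try {
        connection = new URL("http://www.google.com").openConnection();
        Scanner scanner = new Scanner(connection.getInputStream());
        // "\\Z" matches the end of the input (before any final line terminator),
        // so next() returns the whole response body as a single token.
        scanner.useDelimiter("\\Z");
        content = scanner.next();
        scanner.close();
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    System.out.println(content);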

But I'm not sure if there's a better way.
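One possible tightening of the snippet above, offered as a sketch rather than a definitive answer: name the character set explicitly and let try-with-resources close the stream even when reading fails. The class name `ScannerFetch`, the method names `readAll` and `fetch`, and the choice of UTF-8 are all assumptions made for illustration. The sketch also swaps the delimiter to `\\A` (beginning of input), a common variant of the same trick that makes the whole stream one token.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper class; the names here are illustrative only.
class ScannerFetch {

    // Reads the entire stream into one String.
    // Assumption: the content is UTF-8; a real page may declare another charset.
    static String readAll(InputStream in) {
        // try-with-resources closes the Scanner, which also closes the stream.
        try (Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name())) {
            // "\\A" only matches at the beginning of the input,
            // so the first token is everything that follows.
            scanner.useDelimiter("\\A");
            // An empty stream has no token; return "" instead of throwing.
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    // Opens the URL and returns its body as a String.
    static String fetch(String address) throws IOException {
        try (InputStream in = new URL(address).openStream()) {
            return readAll(in);
        }
    }
}
```

Under those assumptions, `ScannerFetch.fetch("http://www.google.com")` would return the page source as a String, and an I/O failure surfaces as an `IOException` instead of leaving `content` null.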

pek answered Oct 02 '22 16:10



This has worked well for me:

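    import java.io.InputStream;
    import java.net.URL;

    URL url = new URL(theURL);
    InputStream is = url.openStream();
    int ptr = 0;
    StringBuffer buffer = new StringBuffer();
    // Read one byte at a time until end of stream (-1).
    // Note: the cast maps each byte to one char, which is only correct
    // for single-byte encodings such as ASCII or ISO-8859-1.
    while ((ptr = is.read()) != -1) {
        buffer.append((char) ptr);
    }
    is.close();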

I'm not sure whether the other solutions provided are any more efficient.
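On the efficiency question, a variant of the loop above that reads in chunks and decodes through an `InputStreamReader` may be worth comparing. This is only a sketch under stated assumptions: the class name `BufferedFetch`, the method names, the 8192-char chunk size, and the UTF-8 default in `fetch` are illustrative choices, not something from the answers here. Decoding through a reader also avoids casting raw bytes to `char`, which matters for multi-byte encodings.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names here are illustrative only.
class BufferedFetch {

    // Reads the whole stream, decoding bytes with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] chunk = new char[8192]; // chunk size is an arbitrary choice
        // try-with-resources closes the reader and the underlying stream.
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            // read(char[]) returns the number of chars read, or -1 at end of stream.
            while ((n = reader.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
        }
        return sb.toString();
    }

    // Opens the URL and returns its body.
    // Assumption: the page is UTF-8 encoded.
    static String fetch(String address) throws IOException {
        try (InputStream in = new URL(address).openStream()) {
            return readAll(in, StandardCharsets.UTF_8);
        }
    }
}
```

`StringBuilder` is used in place of `StringBuffer` because the buffer is confined to one thread, so the synchronization `StringBuffer` provides is not needed.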

Scott Bennett-McLeish answered Oct 02 '22 15:10
