Are there better ways to read an entire HTML file into a single String variable than:
String content = "";
try {
    BufferedReader in = new BufferedReader(new FileReader("mypage.html"));
    String str;
    while ((str = in.readLine()) != null) {
        content += str;
    }
    in.close();
} catch (IOException e) {
}
The readString() method of the Files class (java.nio.file.Files) reads the contents of the specified file. Return value: the content of the file as a String. Note: Files.readString() was introduced in Java 11, so it is not available on earlier versions.
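As a minimal sketch of that approach, assuming Java 11 or later: the class name ReadHtmlFile and the helper readHtml are hypothetical names chosen for illustration, and "mypage.html" is simply the file name from the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReadHtmlFile {

    // Hypothetical helper: returns the whole file as one String (Java 11+).
    static String readHtml(Path path) throws IOException {
        // Files.readString keeps line terminators and decodes UTF-8 by default;
        // the charset is passed explicitly here only to make that visible.
        return Files.readString(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "mypage.html" is the file name used in the question; adjust as needed.
        String content = readHtml(Path.of("mypage.html"));
        System.out.println(content.length() + " characters read");
    }
}
```

On Java 7 through 10, new String(Files.readAllBytes(path), StandardCharsets.UTF_8) should give the same result. Unlike the readLine() loop, both variants keep the original line breaks.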
Just call the html2text method, passing it the HTML text, and it will return plain text.
jsoup's party trick is a CSS selector syntax to find elements, e.g.:
String html = "<html><head><title>First parse</title></head>"
    + "<body><p>Parsed HTML into a doc.</p></body></html>";
Document doc = Jsoup.parse(html);
Elements links = doc.
There's the IOUtils.toString(..) utility from Apache Commons. If you're using Guava, there's also Files.readLines(..) and Files.toString(..).
You should use a StringBuilder:
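StringBuilder contentBuilder = new StringBuilder();
// try-with-resources closes the reader even if reading fails
try (BufferedReader in = new BufferedReader(new FileReader("mypage.html"))) {
    String str;
    while ((str = in.readLine()) != null) {
        // readLine() strips the line terminator, so add one back
        contentBuilder.append(str).append('\n');
    }
} catch (IOException e) {
    e.printStackTrace(); // report the error instead of silently swallowing it
}
String content = contentBuilder.toString();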