I'm coding in Java.
Does anyone know how I can get the content of a javax.swing.text.html.HTMLDocument as a String? This is what I've got so far:
URL url = new URL( "http://www.test.com" );
HTMLEditorKit kit = new HTMLEditorKit();
HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
Reader HTMLReader = new InputStreamReader(url.openConnection().getInputStream());
kit.read(HTMLReader, doc, 0);
I need the content of the HTMLDocument as a String.
Example:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html><head><meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1">
....... etc.
Any help would be appreciated. I need to use the HTMLDocument class so that the HTML is processed correctly :)
Thanks, Daniel
If you just want to parse HTML and your HTML is intended for the body of your document, you could do the following: (1) var div = document.createElement("DIV"); (2) div.innerHTML = markup; (3) result = div.childNodes; This gives you a collection of child nodes and should work not just in IE8 but even in IE6-7.
parseHTML uses native methods to convert the string to a set of DOM nodes, which can then be inserted into the document. These methods do render all trailing or leading text (even if that's just whitespace).
StringWriter writer = new StringWriter();
kit.write(writer, doc, 0, doc.getLength());
String s = writer.toString();
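To make the snippet above easier to try, here is a minimal self-contained sketch. It parses from a StringReader instead of a URL so it runs without network access; the class name HtmlDocumentToString and the method roundTrip are made up for illustration. Note that the kit writes its own serialization of the parsed document, so the result may not match the original source character for character.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, not part of the original question or answer.
class HtmlDocumentToString {

    // Parses the markup into an HTMLDocument, then writes the document back out as a String.
    static String roundTrip(String html) throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Same property as in the question: ignore any charset <meta> directive.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(new StringReader(html), doc, 0);

        StringWriter writer = new StringWriter();
        kit.write(writer, doc, 0, doc.getLength());
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample markup is an assumption; swap in a Reader over your own stream.
        System.out.println(roundTrip("<html><body><p>Hello</p></body></html>"));
    }
}
```

Both kit.read and kit.write declare IOException and BadLocationException, so the caller has to handle or propagate them.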
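You don't need the editor kit and reader at all if you only want the raw markup; just read the input stream. For example, with commons-io: IOUtils.toString(inputStream, StandardCharsets.UTF_8) (the overload without a charset is deprecated).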
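If you would rather not pull in commons-io, the same idea can be sketched with the JDK alone. This assumes Java 9 or later (for InputStream.readAllBytes) and assumes the page is UTF-8 encoded; the class name StreamToString and the method readAll are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class showing the "just read the stream" approach.
class StreamToString {

    // Reads the whole stream and decodes it as UTF-8 (the encoding is an assumption).
    static String readAll(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for url.openConnection().getInputStream() so the sketch runs offline.
        byte[] page = "<html><body>Hi</body></html>".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(page)) {
            System.out.println(readAll(in));
        }
    }
}
```

This returns the markup exactly as the server sent it, without any of the processing that HTMLDocument would apply.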
or, if the plain text without the tags is enough, you can use:
// getContent() is protected in AbstractDocument, so call the public getText() on the document instead
String str = doc.getText(0, doc.getLength());
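For the text-only route, here is a small sketch built on the public Document.getText method, since AbstractDocument.getContent() is protected and not callable from outside the class hierarchy. It returns only the text of the document, not the tags, so it will not reproduce markup like the example in the question. The class name HtmlDocumentPlainText and the method plainText are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, not taken from the original answer.
class HtmlDocumentPlainText {

    // Parses the markup and returns the document's text content without tags.
    static String plainText(String html) throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        kit.read(new StringReader(html), doc, 0);
        return doc.getText(0, doc.getLength());
    }

    public static void main(String[] args) throws Exception {
        // Sample markup is an assumption for the demo.
        System.out.println(plainText("<html><body><p>Hello</p></body></html>"));
    }
}
```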