
Creating an HTMLDocument from a String of HTML (in Java)

I'm working on a method that takes a String of HTML and returns an analogous javax.swing.text.html.HTMLDocument.

What is the most efficient way of doing this?

The way I'm currently doing this is to use a SAX parser to parse the HTML string. I keep track of when I hit open tags (for example, <i>). When I hit the corresponding close tag (for example, </i>), I apply the italics style to the characters I've hit in between.

This certainly works, but it's not fast enough. Is there a faster way of doing this?
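For reference, the tag-tracking approach described above might look roughly like the sketch below. It is not the asker's actual code: the class name SaxItalicsSketch and the method parse are hypothetical, it assumes the input is well-formed XHTML (a SAX parser rejects sloppy HTML), it handles only <i>, and it builds a plain StyledDocument rather than an HTMLDocument.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of the manual approach from the question.
public class SaxItalicsSketch {

    static StyledDocument parse(String xhtml) throws Exception {
        StyledDocument doc = new DefaultStyledDocument();
        // Document offsets at which each currently open <i> element started.
        Deque<Integer> italicStarts = new ArrayDeque<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("i".equalsIgnoreCase(qName)) {
                    italicStarts.push(doc.getLength());
                }
            }

            @Override
            public void characters(char[] ch, int start, int length)
                    throws SAXException {
                try {
                    // Append the text unstyled; styles are applied on the close tag.
                    doc.insertString(doc.getLength(),
                            new String(ch, start, length), null);
                } catch (BadLocationException e) {
                    throw new SAXException(e);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("i".equalsIgnoreCase(qName) && !italicStarts.isEmpty()) {
                    int from = italicStarts.pop();
                    SimpleAttributeSet italic = new SimpleAttributeSet();
                    StyleConstants.setItalic(italic, true);
                    // Style everything inserted since the matching open tag.
                    doc.setCharacterAttributes(from, doc.getLength() - from,
                            italic, false);
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return doc;
    }
}
```

Calling SaxItalicsSketch.parse("<p>Hello <i>world</i></p>") should give a document whose text is "Hello world" with only "world" in italics. Every supported tag needs its own handling like this, which is part of why handing the work to an existing HTML parser is attractive.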

Paul Reiners asked Jul 14 '11 18:07



2 Answers

I agree with mouser, but with a small correction:

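// Let the kit drive the parser; read() also flushes the parsed content into htmlDoc.
Reader stringReader = new StringReader(string);
HTMLEditorKit htmlKit = new HTMLEditorKit();
HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();
htmlKit.read(stringReader, htmlDoc, 0); // throws IOException, BadLocationException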
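To try this out end to end, the snippet can be wrapped in a small self-contained class. This is only a sketch: the class name HtmlStringLoader, the helper toHtmlDocument, and the sample markup are made up for illustration, and it assumes the input has no <meta> charset directive (which can make read() throw a ChangedCharSetException unless the document is told to ignore it).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical wrapper around the HTMLEditorKit.read() approach.
public class HtmlStringLoader {

    // Parses an HTML string into a freshly created HTMLDocument.
    static HTMLDocument toHtmlDocument(String html)
            throws IOException, BadLocationException {
        HTMLEditorKit htmlKit = new HTMLEditorKit();
        HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();
        try (Reader reader = new StringReader(html)) {
            // Insert the parsed content at offset 0 of the empty document.
            htmlKit.read(reader, htmlDoc, 0);
        }
        return htmlDoc;
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument doc = toHtmlDocument(
                "<html><body><p>Hello <i>world</i></p></body></html>");
        // Print the plain text of the parsed document as a quick sanity check.
        System.out.println(doc.getText(0, doc.getLength()).trim());
    }
}
```

If you parse many strings, it is probably worth measuring whether reusing a single HTMLEditorKit instance helps, since the sketch above creates a new one per call.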
StanislavL answered Sep 25 '22 06:09



Try using the HTMLEditorKit class. It supports parsing HTML content that can be read directly from a String (e.g. through a StringReader). There seems to be an article about how to do this.

Edit: To give an example, I think it could be done like this (after the code is executed, htmlDoc should contain the loaded document):

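Reader stringReader = new StringReader(string);
HTMLEditorKit htmlKit = new HTMLEditorKit();
HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();
HTMLEditorKit.Parser parser = new ParserDelegator();
HTMLEditorKit.ParserCallback callback = htmlDoc.getReader(0);
parser.parse(stringReader, callback, true);
// Flush the buffered elements into the document; without this, htmlDoc may stay empty.
callback.flush();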
peterm answered Sep 21 '22 06:09
