Where can I find a good HTMLEditorKit tutorial/reference, which actually explains how to edit HTML documents? [closed]

My intention is to edit HTML documents, including modifying existing elements, deleting elements and inserting new ones.

I've read HTMLEditorKit's and related classes' documentation, as well as the relevant topic in Sun's Java Trail, yet there is very little information about actual HTML document manipulation. Most of the discussion and examples deal with reading and parsing HTML, not really editing it. Some Googling still did not yield an adequate solution, and trying to tackle the task with some coding trial and error mostly resulted in exceptions.

I've gone over related questions and answers here on SO, but most answers suggested some alternative, while I'm looking for a solution in the JDK. Perhaps HTMLEditorKit is of little use to non-Swing applications, and there is an alternative outside javax.swing?

Here are a few tasks I'd like to learn how to perform:

  • Replace text in certain text fields.
  • Basic editing (find/replace or regexes) of <script> elements.
  • Color the border of certain elements.
  • Remove certain tags entirely (for example flash elements).

Assuming that HTMLEditorKit is the best HTML editing component in the JDK, what tutorial or reference do you recommend?

Oren Shalev asked Oct 15 '22 13:10



2 Answers

HTMLEditorKit is not an HTML editor. It is an editor kit for document models that can convert those models from and to HTML. The kit's internal model is not "HTML" but is based on DefaultStyledDocument. What may be confusing is that there is an HTMLDocument class, but that is just a thin subclass of DefaultStyledDocument that can be created from HTML and saved as HTML.
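
To make the point about the model concrete, here is a minimal sketch that parses a string with the kit and then looks at what comes out. The class name KitModelDemo and the sample markup are made up for illustration. It only uses the kit's read() and createDefaultDocument() methods and needs no visible Swing component.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical demo class; the name is not from the answer above.
final class KitModelDemo {

    /** Parses an HTML string into the document model the kit creates. */
    static Document parse(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Could not parse HTML", e);
        }
        return doc;
    }

    /** Returns the character content of the model, without any markup. */
    static String plainText(String html) {
        Document doc = parse(html);
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <b>world</b></p></body></html>";
        // The parsed document is a styled-text model, not a tag tree.
        System.out.println(parse(html) instanceof DefaultStyledDocument);
        System.out.println(plainText(html).trim());
    }
}
```

The model holds characters plus attributes, so the tags themselves are gone once parsing has finished. They are regenerated from the attributes when the document is written back out.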

What you need is an HTML parser. Try jTidy. It will read the HTML, build an internal model (keeping things like <script> which HTMLEditorKit will ignore). You can then use a DOM API to modify the model.
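
As a rough illustration of the "modify the model through a DOM API" step, the sketch below removes every element with a given tag name using only org.w3c.dom calls. To stay inside the JDK it swaps jTidy for the JDK's own XML parser, so it assumes the input is already well-formed XHTML without a DOCTYPE or HTML-only entities such as &nbsp;. The class name DomEditSketch is hypothetical. With a tolerant parser in front, the DOM calls should be the same, assuming that parser hands back an org.w3c.dom.Document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; assumes well-formed XHTML input (see the note above).
final class DomEditSketch {

    /** Removes every element named tagName and returns the serialized result. */
    static String removeElements(String xhtml, String tagName) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            // getElementsByTagName returns a live list, so walk it backwards
            // to keep the remaining indexes valid while nodes are removed.
            NodeList matches = dom.getElementsByTagName(tagName);
            for (int i = matches.getLength() - 1; i >= 0; i--) {
                Node node = matches.item(i);
                node.getParentNode().removeChild(node);
            }

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            // Force XML output so the serializer does not add HTML-specific tags.
            serializer.setOutputProperty(OutputKeys.METHOD, "xml");
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not edit document", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><p>keep</p>"
                + "<object data=\"movie.swf\"><param name=\"q\"/></object>"
                + "</body></html>";
        System.out.println(removeElements(page, "object"));
    }
}
```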

That said, for many use cases it is enough to filter the HTML with regular expressions or a simple string search and replace.
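
A sketch of the regular-expression route, using only java.util.regex. The class name, the patterns and the choice of object/embed as the "flash" tags are assumptions for illustration. Patterns like these are fine for simple, predictable pages, but can be fooled by comments, nested tags of the same name, or a ">" inside an attribute value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the patterns are illustrative, not a full HTML grammar.
final class RegexHtmlFilter {

    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;

    // Either a paired <object>...</object> or <embed>...</embed>, or a lone tag.
    private static final Pattern FLASH = Pattern.compile(
            "<(object|embed)\\b[^>]*>.*?</\\1\\s*>|<(?:object|embed)\\b[^>]*>", FLAGS);

    // Group 1: opening tag, group 2: script body, group 3: closing tag.
    private static final Pattern SCRIPT = Pattern.compile(
            "(<script\\b[^>]*>)(.*?)(</script\\s*>)", FLAGS);

    /** Drops object and embed elements together with their content. */
    static String stripFlash(String html) {
        return FLASH.matcher(html).replaceAll("");
    }

    /** Replaces target with replacement, but only inside script bodies. */
    static String replaceInScripts(String html, String target, String replacement) {
        Matcher m = SCRIPT.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(2).replace(target, replacement);
            // quoteReplacement keeps '$' and '\' in the script from being
            // interpreted as group references.
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + body + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripFlash(
                "<p>a</p><object data=\"m.swf\"><embed src=\"m.swf\"></object><p>b</p>"));
        System.out.println(replaceInScripts(
                "<p>foo</p><script>var foo = 1;</script>", "foo", "bar"));
    }
}
```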

Aaron Digulla answered Oct 21 '22 09:10



If the HTML page you are trying to manipulate isn't very complicated, I think you can build it yourself like this:

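import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// jEditorPane is an existing javax.swing.JEditorPane.
HTMLEditorKit kit = new HTMLEditorKit();
// Let the kit create the document so it gets the kit's style sheet and parser.
HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();

// Install the kit first: setEditorKit() replaces the pane's document with a new one.
jEditorPane.setEditorKit(kit);
jEditorPane.setDocument(doc);

// insertHTML() throws BadLocationException and IOException, so call it inside a try block.
kit.insertHTML(doc, doc.getLength(), "<label>This label will be inserted directly inside the body</label>", 0, 0, null);
kit.insertHTML(doc, doc.getLength(), "<br/>", 0, 0, null);
kit.insertHTML(doc, doc.getLength(), putYourVariableHere, 0, 0, null);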

That way you have full control over the HTML, and it is faster than loading it from an external HTML file.
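
The same kit can also change text in a document that already exists, which is closer to the "replace text" task in the question. The following is a minimal sketch, assuming the target element carries an id attribute. The class name EditByIdSketch and the sample ids are made up. It edits the styled-document model by character offset and then asks the kit to write HTML back out, so the output is regenerated markup rather than the original source with one change.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper; assumes the element to edit has an id attribute.
final class EditByIdSketch {

    /** Replaces the text of the element with the given id and returns the new HTML. */
    static String replaceText(String html, String id, String newText) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);

            Element target = doc.getElement(id);
            if (target == null) {
                throw new IllegalArgumentException("No element with id " + id);
            }

            int start = target.getStartOffset();
            // A block element's range ends with its line terminator; leave it in place.
            int end = Math.min(target.getEndOffset() - 1, doc.getLength());
            // Passing null attributes lets HTMLDocument use its default content attributes.
            doc.replace(start, end - start, newText, null);

            StringWriter out = new StringWriter();
            kit.write(out, doc, 0, doc.getLength());
            return out.toString();
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Could not edit document", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><p id=\"msg\">Original</p><p>keep</p></body></html>";
        System.out.println(replaceText(page, "msg", "Updated"));
    }
}
```

Because the text is inserted with default attributes, inline styling inside the replaced range (bold, links and so on) is not preserved.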

RyanSF answered Oct 21 '22 09:10
