I'm looking for documentation (official documentation, if possible) for the TagSoup and JTidy libraries.
I want to use these libraries to manipulate HTML "tag soup" files that contain XML tags from different namespaces mixed in among the HTML (HTML, XHTML or HTML5) tags.
I have tested HTMLCleaner, NekoHTML and Jericho, but I can't find documentation for JTidy and TagSoup beyond the simplest examples of cleaning up a file.
I need documentation on manipulating content, replacing tags, extracting information, etc.
Thanks
Note: After testing all the options, I ended up using StAX / Woodstox:
http://wiki.fasterxml.com/WoodstoxHome
https://en.wikipedia.org/wiki/StAX
https://docs.oracle.com/javase/tutorial/jaxp/stax/using.html
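For readers wondering what "replace tags" and "extract info" might look like with StAX, here is a minimal sketch using only the JDK's `javax.xml.stream` event API (Woodstox can be dropped in as the implementation without code changes). The class name `StaxSketch`, the namespace URI `urn:example:custom`, and the element names `note` and `remark` are all made up for illustration and are not from the original post. Note that StAX expects well-formed XML, so real tag soup would presumably need to be cleaned by another tool first.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class StaxSketch {
    // Hypothetical namespace and sample input, for illustration only.
    static final String NS = "urn:example:custom";
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:x=\"" + NS + "\">"
        + "<body><p>Hi</p><x:note>first</x:note><x:note>second</x:note></body></html>";

    // True when the element name has the given namespace URI and local name.
    static boolean matches(QName name, String ns, String local) {
        return ns.equals(name.getNamespaceURI()) && local.equals(name.getLocalPart());
    }

    // Collects the text content of every element {ns}local (nesting of the
    // same element inside itself is not handled in this sketch).
    static List<String> extractText(String xml, String ns, String local)
            throws XMLStreamException {
        XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<String> found = new ArrayList<>();
        StringBuilder text = null; // non-null while inside a matching element
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && matches(event.asStartElement().getName(), ns, local)) {
                text = new StringBuilder();
            } else if (event.isCharacters() && text != null) {
                text.append(event.asCharacters().getData());
            } else if (event.isEndElement() && text != null
                    && matches(event.asEndElement().getName(), ns, local)) {
                found.add(text.toString());
                text = null;
            }
        }
        reader.close();
        return found;
    }

    // Copies the document event by event, renaming {ns}oldLocal to {ns}newLocal
    // while keeping the original prefix, attributes and namespace declarations.
    static String renameElement(String xml, String ns, String oldLocal, String newLocal)
            throws XMLStreamException {
        XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && matches(event.asStartElement().getName(), ns, oldLocal)) {
                StartElement start = event.asStartElement();
                QName renamed = new QName(ns, newLocal, start.getName().getPrefix());
                event = events.createStartElement(
                    renamed, start.getAttributes(), start.getNamespaces());
            } else if (event.isEndElement()
                    && matches(event.asEndElement().getName(), ns, oldLocal)) {
                EndElement end = event.asEndElement();
                QName renamed = new QName(ns, newLocal, end.getName().getPrefix());
                event = events.createEndElement(renamed, end.getNamespaces());
            }
            writer.add(event);
        }
        writer.close();
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(extractText(SAMPLE, NS, "note"));
        System.out.println(renameElement(SAMPLE, NS, "note", "remark"));
    }
}
```

Matching on namespace URI plus local name, rather than on the prefix, is what lets the same code work no matter which prefix a given file binds to the namespace.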
The answer to a similar question on the tagsoup-friends google group may help:
Documentation for TagSoup
You've probably already seen it, but the Javadoc for JTidy is available here: http://jtidy.sourceforge.net/apidocs/index.html