
JTidy Node.findBody() — How to use?

I'm trying to do XHTML DOM parsing with JTidy, and it turns out to be a rather counterintuitive task. In particular, there's a method to parse HTML:

Node Tidy.parse(Reader, Writer)

And to get the <body /> of that Node, I assume, I should use

Node Node.findBody(TagTable)

Where do I get an instance of that TagTable? (Its constructor is protected, and I haven't found a factory that produces one.)

I use JTidy 8.0-SNAPSHOT.

asked Dec 30 '22 09:12 by ansgri


1 Answer

I found there's a much simpler way to extract the body:

import org.w3c.tidy.Tidy;

Tidy tidy = new Tidy();
tidy.setXHTML(true);          // emit XHTML output
tidy.setPrintBodyOnly(true);  // write only the contents of <body>

Then run tidy.parse(reader, writer) on the Reader-Writer pair; only the body content is written to the Writer.

Simple as it should be.
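
If the body is needed as a DOM node rather than as serialized text, one option is to look it up by tag name on a standard org.w3c.dom.Document, which avoids TagTable entirely. The sketch below is only an illustration: to stay runnable without the JTidy jar it parses already well-formed XHTML with the JDK's own DocumentBuilder as a stand-in. With JTidy, the Document would presumably come from Tidy.parseDOM(Reader, Writer) instead; that call is an assumption here and has not been checked against 8.0-SNAPSHOT. The class and method names (BodyLookup, findBody, parseXhtml) are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class BodyLookup {
    // Returns the first <body> element of a DOM document, or null if there is none.
    static Element findBody(Document doc) {
        NodeList bodies = doc.getElementsByTagName("body");
        return bodies.getLength() > 0 ? (Element) bodies.item(0) : null;
    }

    // Stand-in for JTidy (assumption): parses well-formed XHTML with the JDK parser.
    static Document parseXhtml(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><head><title>t</title></head><body><p>Hello</p></body></html>";
        Element body = findBody(parseXhtml(xhtml));
        System.out.println(body.getTagName() + ": " + body.getTextContent());
    }
}
```

The lookup itself depends only on the W3C DOM interfaces, so it should work the same on any Document, whichever parser produced it.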

answered Jan 10 '23 07:01 by ansgri