
jQuery-like library for Java

I am looking for a simple, lightweight Java library that parses HTML. I have searched a lot and there are many options out there, but I cannot find anything simple. I would really like something like pyquery in Python, but for Java. My requirements are: fast, easy to use, and lightweight.

What do I need it for? I'm not sure it matters, but I need to index parts of HTML documents, so I am hoping to be able to select part of a document quickly and then parse it.

Amir Raminfar asked Oct 22 '10 00:10



1 Answer

I have used HTMLParser in the past and wasn't very happy with it. I have since found TagSoup and jsoup, and I really like jsoup. I haven't used it extensively yet, but it lets you query a parsed document with CSS selectors, so you can do something like:

Elements resultLinks = doc.select("h3 > a"); // <a> elements that are direct children of an <h3>
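For context, here is a rough, self-contained sketch of how that one-liner might sit in a small program. It assumes the jsoup jar is on the classpath; the class name `JsoupSelectSketch`, the helper `headingLinkTexts`, and the sample HTML are made up for illustration and are not part of jsoup.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupSelectSketch {

    // Hypothetical helper: returns the text of every <a> element
    // that is a direct child of an <h3> in the given HTML string.
    static List<String> headingLinkTexts(String html) {
        // Parse the HTML string into a DOM-like Document.
        Document doc = Jsoup.parse(html);

        // Same CSS selector as in the snippet above.
        Elements resultLinks = doc.select("h3 > a");

        List<String> texts = new ArrayList<>();
        for (Element link : resultLinks) {
            texts.add(link.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Made-up sample input: one link under a heading, one outside it.
        String html = "<h3><a href=\"/first\">First result</a></h3>"
                + "<p><a href=\"/other\">Not under a heading</a></p>";

        System.out.println(headingLinkTexts(html));
    }
}
```

Only the link inside the `<h3>` should be picked up, since `>` in the selector restricts the match to direct children. For the indexing use case in the question, the same pattern would presumably work with whatever selector identifies the part of the page to be indexed.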
Amir Raminfar answered Oct 01 '22 03:10
