I'm trying to parse an HTML document with Jsoup to get all heading tags. In addition, I need to group the heading tags as [h1], [h2], etc.
hh = doc.select("h[0-6]");
but this gives me an empty result.
Your selector means an h tag with an attribute named "0-6" here, not a regex. But you can combine multiple selectors instead (there is no h0 tag, so the list starts at h1): hh = doc.select("h1, h2, h3, h4, h5, h6");
Grouping: do you need one group with all h tags plus a group for each of h1, h2, ..., or only a group for each of h1, h2, ...?
Here's an example of how you can do this:
// Group of all h-Tags
Elements hTags = doc.select("h1, h2, h3, h4, h5, h6");
// Group of all h1-Tags
Elements h1Tags = hTags.select("h1");
// Group of all h2-Tags
Elements h2Tags = hTags.select("h2");
// ... etc.
If you only want a group for each h1, h2, ... tag, you can drop the first selector and replace hTags with doc in the others.
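If you would rather build all the per-tag groups in one pass, here is a minimal sketch. It assumes Jsoup is on the classpath; the class name HeadingGrouper and the method groupHeadings are made up for illustration, and it collects the heading text rather than the Element objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper, not part of Jsoup: groups heading text by tag name.
class HeadingGrouper {

    static Map<String, List<String>> groupHeadings(String html) {
        Document doc = Jsoup.parse(html);
        // LinkedHashMap keeps the groups in the order each tag is first seen.
        Map<String, List<String>> groups = new LinkedHashMap<>();
        // One combined selector for all heading levels.
        for (Element heading : doc.select("h1, h2, h3, h4, h5, h6")) {
            groups.computeIfAbsent(heading.tagName(), k -> new ArrayList<>())
                  .add(heading.text());
        }
        return groups;
    }

    public static void main(String[] args) {
        String html = "<h1>Title</h1><h2>Intro</h2><p>text</p><h2>Details</h2>";
        System.out.println(groupHeadings(html));
    }
}
```

Heading levels that do not occur in the document simply have no entry in the map, so check for null (or use getOrDefault) when looking up a level.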
Use doc.select("h1,h2,h3,h4,h5,h6") to get all heading tags. Use doc.select("h1"), doc.select("h2"), and so on to get each kind of heading separately. For the other things you can do with a select statement, see http://preciselyconcise.com/apis_and_installations/jsoup/j_selector.php