In jsoup Element.children()
returns all children (descendants) of Element. But, I want the Element's first-level children (direct children).
Which method can I use?
With XPath expressions it is able to select the elements within the HTML using Jsoup as HTML parser.
A HTML element consists of a tag name, attributes, and child nodes (including text nodes and other elements). From an Element, you can extract data, traverse the node graph, and manipulate the HTML.
clean. Creates a new, clean document, from the original dirty document, containing only elements allowed by the safelist. The original document is not modified. Only elements from the dirty document's body are used.
Element.children() returns direct children only. Since you get them bound to a tree, they have children too.
If you need the direct children elements without the underlying tree structure then you need to create them as follows
public static void main(String... args) {
Document document = Jsoup
.parse("<div><ul><li>11</li><li>22</li></ul><p>ppp<span>sp</span</p></div>");
Element div = document.select("div").first();
Elements divChildren = div.children();
Elements detachedDivChildren = new Elements();
for (Element elem : divChildren) {
Element detachedChild = new Element(Tag.valueOf(elem.tagName()),
elem.baseUri(), elem.attributes().clone());
detachedDivChildren.add(detachedChild);
}
System.out.println(divChildren.size());
for (Element elem : divChildren) {
System.out.println(elem.tagName());
}
System.out.println("\ndivChildren content: \n" + divChildren);
System.out.println("\ndetachedDivChildren content: \n"
+ detachedDivChildren);
}
Output
2
ul
p
divChildren content:
<ul>
<li>11</li>
<li>22</li>
</ul>
<p>ppp<span>sp</span></p>
detachedDivChildren content:
<ul></ul>
<p></p>
This should give you the desired list of direct descendants of the parent node:
Elements firstLevelChildElements = doc.select("parent-tag > *");
OR You can also try to retrieve the parent element, get the first child node via child(int index)
and then try to retrieve siblings of this child via siblingElements()
.
This will give you the list of first level children excluding the used child, however you'd have to add the child externally.
Elements firstLevelChildElements = doc.child(0).siblingElements();
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With