remove parent node while preserving child node in jsoup

Tags:

html

jsoup

Consider this sample HTML:

<div>
<outer-tag> Some Text <inner-tag> Some more text</inner-tag></outer-tag>
</div>

I want to get the following output:

<div>
<inner-tag> Some more text</inner-tag>
</div>

How do I achieve this? Thanks!

Archit Arora asked Feb 19 '15 07:02


1 Answer

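This solution will work for your current example. It deletes the text that sits directly inside outer-tag, then unwraps outer-tag so that its remaining children move up into the div: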

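// imports needed by this snippet
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

String html = "<div>"
        + "<outer-tag> Some Text <inner-tag> Some more text</inner-tag></outer-tag>"
        + "</div>";

Document doc = Jsoup.parseBodyFragment(html);

for (Element div : doc.select("div")) {

    // get the unwanted outer-tag; skip any div that does not contain one
    Element outerTag = div.select("outer-tag").first();
    if (outerTag == null) continue;

    // delete any TextNodes that are direct children of outer-tag;
    // textNodes() returns a separate list, so removing while iterating is safe
    for (TextNode text : outerTag.textNodes()) {
        text.remove();
    }

    // unwrap to remove outer-tag and move inner-tag up to the parent div
    outerTag.unwrap();

    // print the result
    System.out.println(div);
}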

Result is:

<div>
 <inner-tag>
    Some more text
 </inner-tag>
</div>
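
For intuition about what the "drop the text, then unwrap" step does to the tree, here is a minimal sketch that performs the same moves by hand. Note that it uses the JDK's built-in org.w3c.dom API rather than jsoup, so it only accepts well-formed XML, and the class and method names (UnwrapSketch, unwrapDroppingText, demo) are made up for illustration. It is not how jsoup implements unwrap(), just the same idea spelled out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class UnwrapSketch {

    // Hypothetical helper: deletes the direct text children of `wrapper`,
    // moves its other children up into the parent, then removes `wrapper`.
    static void unwrapDroppingText(Element wrapper) {
        Node parent = wrapper.getParentNode();
        Node child = wrapper.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before detaching
            if (child.getNodeType() == Node.TEXT_NODE) {
                wrapper.removeChild(child);
            } else {
                // insertBefore moves an already-attached node to its new place
                parent.insertBefore(child, wrapper);
            }
            child = next;
        }
        parent.removeChild(wrapper);
    }

    // Runs the helper on the markup from the question and summarizes the result.
    static String demo() {
        String xml = "<div><outer-tag> Some Text "
                + "<inner-tag> Some more text</inner-tag></outer-tag></div>";
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element outer = (Element) doc.getElementsByTagName("outer-tag").item(0);
            unwrapDroppingText(outer);
            Element div = doc.getDocumentElement();
            // name of div's first child, then all text left inside div
            return div.getFirstChild().getNodeName() + ":" + div.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "inner-tag: Some more text"
    }
}
```

The next sibling is saved before each node is detached, which avoids the usual pitfall of modifying a child list while walking it.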

ashatte answered Sep 29 '22 09:09