
How can I get all div elements with jsoup?

Tags: java, html, jsoup

I am learning how to use jsoup. At first I thought jsoup worked like jQuery, but it does not.

I want to split this HTML body into one element per div, each holding only that div's contents.

<html>
<head></head>
    <body>
            <div>
                <h1>Title</h1>
            </div>
            <div>
                <img src="/xx.jpg" />
            </div>
            <div>
                <p>Paragraph 1</p>
                <p>Paragraph 2</p>
            </div>
            <div>
                <h2><b>End</b></h2>
            </div>
        </body>
</html>

I am using this code:

Document doc = Jsoup.parse(htmlString);
Elements divs = doc.select("div");

but each element it returns still includes the enclosing div tags. I want each element to return only its contents, like this:

divs.get(0).toString(); // "<h1>Title</h1>"
divs.get(1).toString(); // "<img src="/xx.jpg" />"
divs.get(2).toString(); // "<p>Paragraph 1</p><p>Paragraph 2</p>"
divs.get(3).toString(); // "<h2><b>End</b></h2>"

How can I get the contents of each div with jsoup so that they are returned as above?

yozawiratama asked Oct 02 '14 14:10



2 Answers

Calling .html() on an element returns its inner HTML, that is, the markup between its opening and closing tags.

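import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Document doc = Jsoup.parse(htmlString);
Elements divs = doc.select("div");
// divs.get(0).html() would return the inner HTML of the first div only
for (Element elem : divs) {
    System.out.println(elem.html()); // inner HTML of each div, without the div tags
}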

If you want the result to include the div tags themselves, use .outerHtml() instead.
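To make the difference between the two calls concrete, here is a minimal sketch. The class name, method names and the shortened SAMPLE string are made up for illustration; only Jsoup.parse, select, first, html and outerHtml are jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical demo class; not part of jsoup.
class HtmlVersusOuterHtml {
    // Assumed sample input: a single div, shortened from the markup in the question.
    static final String SAMPLE =
            "<html><head></head><body><div><h1>Title</h1></div></body></html>";

    // Returns the markup inside the first div, without the div tags.
    // Note: first() returns null when nothing matches, so real code should check for that.
    static String inner(String html) {
        Document doc = Jsoup.parse(html);
        Element div = doc.select("div").first();
        return div.html();
    }

    // Returns the markup of the first div including the div tags themselves.
    static String outer(String html) {
        Document doc = Jsoup.parse(html);
        Element div = doc.select("div").first();
        return div.outerHtml();
    }

    public static void main(String[] args) {
        System.out.println(inner(SAMPLE));
        System.out.println(outer(SAMPLE));
    }
}
```

Calling toString() on an Element, as in the question, behaves like outerHtml(), which is why the div tags showed up there.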

Ken de Guzman answered Oct 10 '22 16:10



Call divs.get(0).html();

It returns the inner HTML of that element, without the enclosing tag.
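If you need every div's contents addressable by index, as in the question, one option is to collect the html() results into a list. This is only a sketch: the class and method names are hypothetical, and the two-div input in main is an assumed stand-in for the question's markup.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

// Hypothetical helper class; not part of jsoup.
class DivContents {
    // Returns the inner HTML of every div in the document, in document order.
    // select("div") also matches nested divs, so those get their own entries.
    static List<String> innerHtmlOfDivs(String html) {
        List<String> contents = new ArrayList<>();
        for (Element div : Jsoup.parse(html).select("div")) {
            contents.add(div.html());
        }
        return contents;
    }

    public static void main(String[] args) {
        // Assumed input: two sibling divs, shortened from the question.
        String html = "<div><h1>Title</h1></div><div><p>Paragraph 1</p><p>Paragraph 2</p></div>";
        List<String> contents = innerHtmlOfDivs(html);
        for (int i = 0; i < contents.size(); i++) {
            System.out.println(i + ": " + contents.get(i));
        }
    }
}
```

With jsoup's default pretty-printing, a div that holds several block elements may come back spread over more than one line, so the strings are not guaranteed to match the single-line form shown in the question character for character.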

RaunaqSuri answered Oct 10 '22 17:10
