Here's my problem. I have this HTML content: <div> <a href="#"> innerText </a> </div> and I need to extract the "innerText". While trying this in Jsoup, I found that the inner text ends up outside the anchor tag after Jsoup parses it.
Here's my code:
Document doc = Jsoup.parse("<div> <a href=\"#\"> innerText </a> </div>");
System.out.println(doc.html());
Output:
<html>
<head></head>
<body>
<div >
<a href="#"></a>innerText
</div>
</body>
</html>
Why is "innerText" moved outside the anchor tag?
You can access the text by calling the text() method on the element:
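import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Document doc = Jsoup.parse("<div> <a href=\"#\"> innerText </a> </div>");
System.out.println(doc.html());
// Collect every <a> element and print its text content
Elements links = doc.getElementsByTag("a");
for (Element element : links) {
    System.out.println("element = " + element.text());
}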
By the way, running your posted code (with Jsoup 1.8.1) produces the following output, with the text still inside the anchor tag:
<html>
<head></head>
<body>
<div>
<a href="#"> innerText </a>
</div>
</body>
</html>
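If you only care about one link, a CSS selector saves you the loop. This is just a minimal sketch, assuming Jsoup is on the classpath; the class name AnchorTextExample and the helper firstAnchorText are made up for illustration and are not part of Jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AnchorTextExample {

    // Hypothetical helper: returns the text of the first <a> that is a
    // direct child of a <div>, or null when the document has no such element.
    static String firstAnchorText(String html) {
        Document doc = Jsoup.parse(html);
        // select() takes a CSS query; first() returns null on an empty result
        Element link = doc.select("div > a").first();
        // text() normalizes whitespace and trims the result
        return link == null ? null : link.text();
    }

    public static void main(String[] args) {
        String html = "<div> <a href=\"#\"> innerText </a> </div>";
        System.out.println(firstAnchorText(html));
    }
}
```

The null check matters because first() does not throw when nothing matches, so calling text() on the result directly would fail with a NullPointerException on pages without a matching link.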