I cannot find any direct method like isDisplayed()
in Jsoup's Element class.
I can check for an input with type = "hidden"
by using the following code:
"hidden".equalsIgnoreCase(elm.attr("type"))
But I also need to detect elements hidden by CSS, as well as elements that are hidden only because one of their ancestors is hidden.
Jsoup can also select elements within the HTML using XPath expressions when it is used as the HTML parser.
Where: document is the Document object representing the HTML DOM; Jsoup is the main class used to parse the given HTML string; html is the HTML string; sampleDiv is the Element object representing the HTML node identified by the id "sampleDiv".
clean: creates a new, clean document from the original dirty document, containing only elements allowed by the safelist. The original document is not modified. Only elements from the dirty document's body are used.
Jsoup parses the source code as delivered from the server (or in this case loaded from file). It does not invoke client-side actions such as JavaScript or CSS DOM manipulation.
Pshemo already said it in his comment: JSOUP is not a JavaScript interpreter, and it does not merge external CSS information into the HTML. JSOUP just parses HTML, and it is very good at that. Nothing more, but also nothing less. You can also use JSOUP to fetch HTML pages over the network, but that really is the limit of it.
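If a rough approximation is enough, here is a minimal sketch of what can be read from the markup alone with JSOUP: the hidden attribute, input type="hidden", and an inline style of display:none or visibility:hidden, checked on the element and on each of its ancestors. The class InlineVisibility and the methods hidesItself and looksHidden are hypothetical names made up for this example, not part of JSOUP. It is only a heuristic: it cannot see rules from stylesheets or classes, anything changed by JavaScript, or a child that overrides an inherited visibility:hidden.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.Locale;

public class InlineVisibility {

    // Hypothetical helper (not a JSOUP API): does this element's own markup hide it?
    public static boolean hidesItself(Element el) {
        if (el.hasAttr("hidden")) {
            return true; // HTML5 boolean "hidden" attribute
        }
        if ("input".equalsIgnoreCase(el.tagName())
                && "hidden".equalsIgnoreCase(el.attr("type"))) {
            return true; // <input type="hidden">
        }
        // Remove whitespace and lower-case, so "Display : None" becomes "display:none".
        String style = el.attr("style").replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        for (String decl : style.split(";")) {
            // startsWith also accepts "display:none!important"
            if (decl.startsWith("display:none") || decl.startsWith("visibility:hidden")) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: true if the element or any ancestor hides itself.
    public static boolean looksHidden(Element el) {
        for (Element cur = el; cur != null; cur = cur.parent()) {
            if (hidesItself(cur)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String html = "<div style='display: none'><p id='a'>x</p></div>"
                + "<p id='b'>y</p>"
                + "<input id='c' type='HIDDEN'>";
        Document doc = Jsoup.parse(html);
        System.out.println(looksHidden(doc.getElementById("a"))); // hidden via parent div
        System.out.println(looksHidden(doc.getElementById("b"))); // nothing hides it
        System.out.println(looksHidden(doc.getElementById("c"))); // hidden input
    }
}
```

Anything this sketch reports as visible may still be hidden in a real browser, so treat a "not hidden" result as "unknown" rather than as proof of visibility.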
About your problem: You should think hard about whether you really need to know if an element is visible or hidden. If you do, you probably need a testing framework that behaves like a browser. For Java there are very good bindings to Selenium WebDriver, which drives a real browser to load and test pages. You can also scrape content with Selenium. I have had good experience combining the two: Selenium for accessing the web content, then switching over to JSOUP for the actual scraping. In your case you can use the powerful WebDriver API directly to find out whether an element is hidden or not.
Selenium WebDriver works with Firefox, Chrome and a number of other browsers. If you need a lightweight alternative, you can use a headless browser. One option is PhantomJS, which is excellently supported by Selenium. Another is HtmlUnit, which is even lighter and uses the Rhino JavaScript interpreter, which is written in Java.
As you can see, there are quite a few options to choose from to achieve what you want. Just not JSOUP, although it is a great library.