I am working on a project in which I need to scrape some information from different websites. I am using HtmlUnit
for this purpose, but the problem is that I am unable to traverse the elements on a page.
Example:
<div id="some_id">
<div>
<div>
<div>
......
many divs in between
......
<div id="my_target_div"> some information </div>
........
........
</div>
Now, how do I get the div with id my_target_div and the information inside that div?
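Use HtmlPage.getHtmlElementById. It looks the element up by id anywhere in the page, so the nesting depth does not matter, and it throws ElementNotFoundException if no element has that id.
Check the documentation.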
An example:
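@Test
public void getElements() throws Exception {
    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("http://some_url");
    // Look up the div by id, however deeply it is nested.
    final HtmlDivision div = page.getHtmlElementById("my_target_div");
    // The information inside the div, as plain text.
    final String information = div.getTextContent();
    // closeAllWindows() is from older HtmlUnit releases; later ones provide close().
    webClient.closeAllWindows();
}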
Source.
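WebClient webClient = new WebClient();
HtmlPage page = webClient.getPage("http://some_url");
// getFirstByXPath returns null when nothing matches the expression.
HtmlElement div = (HtmlElement) page.getFirstByXPath("//div[@id='my_target_div']");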
This will solve your problem.
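If you want to try the XPath expression on its own before pointing it at a live page, here is a minimal sketch that uses only the JDK's built-in XPath API rather than HtmlUnit. The class name XPathDivLookup and the method textOfDiv are made up for illustration, and the markup is a well-formed version of the example in the question (the JDK parser needs well-formed XML, unlike HtmlUnit, which tolerates real-world HTML).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of HtmlUnit: shows what //div[@id='...'] selects.
class XPathDivLookup {

    // Returns the trimmed text of the first div with the given id, or null if none matches.
    // Assumption: id contains no single quote, since it is pasted into the expression.
    static String textOfDiv(String xhtml, String id) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        // "//" searches at any depth, so the intermediate divs do not need to be listed.
        String expression = "//div[@id='" + id + "']";
        Node node = (Node) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression, doc, XPathConstants.NODE);
        return node == null ? null : node.getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<div id=\"some_id\"><div><div><div>"
                + "<div id=\"my_target_div\"> some information </div>"
                + "</div></div></div></div>";
        System.out.println(textOfDiv(xhtml, "my_target_div"));
    }
}
```

The same expression string can then be passed to getFirstByXPath in HtmlUnit, where a null result likewise means that nothing matched.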