I am working on a project where I need to scrape some information from different websites. I am using HtmlUnit for this purpose, but the problem is that I am unable to traverse the elements on a page.
Example:
<div id="some_id">
  <div>
    <div>
      <div>
        ......
        many divs in between
        ......
        <div id="my_target_div"> some information </div>
        ........
        ........
</div>
Now, how do I get the div with id my_target_div and the information inside that div?
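Use getHtmlElementById. It finds the element by id no matter how deeply it is nested, and throws ElementNotFoundException if no element has that id.
Check the documentation.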
An example:
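// HtmlUnit 2.x package names; 3.x and later use org.htmlunit instead.
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlDivision;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import org.junit.Test;

@Test
public void getElements() throws Exception {
    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("http://some_url");
    final HtmlDivision div = page.getHtmlElementById("my_target_div");
    final String info = div.getTextContent(); // text inside the div
    webClient.closeAllWindows();
}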
Source.
WebClient webClient = new WebClient();
// "http://some_url" is a placeholder; load the page before querying it.
HtmlPage page = webClient.getPage("http://some_url");
HtmlElement div = page.getFirstByXPath("//div[@id='my_target_div']");
This returns the first div whose id is my_target_div, or null if there is none, which should solve your problem.
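If you want to try the XPath expression itself without pulling in HtmlUnit, here is a minimal sketch that runs the same query with the JDK's built-in javax.xml.xpath API against a well-formed copy of the markup from the question. The class name XPathByIdSketch, the helper textOfDivWithId, and the sample string are made up for illustration, and this is not how HtmlUnit evaluates XPath internally. The JDK parser also needs well-formed XML, which real pages often are not.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XPathByIdSketch {

    // Hypothetical helper: returns the trimmed text of the first <div> with the
    // given id, or null if no such div exists. Assumes the id contains no quotes.
    static String textOfDivWithId(String xhtml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            // "//" matches descendants at any depth, so the number of
            // wrapper divs in between does not matter.
            String expr = "//div[@id='" + id + "']";
            Element div = (Element) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expr, doc, XPathConstants.NODE);
            return div == null ? null : div.getTextContent().trim();
        } catch (Exception e) {
            // Parsing or XPath failure: rethrow unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Well-formed stand-in for the nested markup in the question.
        String xhtml =
            "<div id=\"some_id\"><div><div><div>"
            + "<div id=\"my_target_div\"> some information </div>"
            + "</div></div></div></div>";
        System.out.println(textOfDivWithId(xhtml, "my_target_div"));
    }
}
```

The point is only that the leading // makes the lookup independent of nesting depth, which is why the same expression works in getFirstByXPath.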