I am trying to parse a webpage and extract data using Jsoup. The link is dynamic, though, and shows a wait-for-loading page before displaying the details, so Jsoup seems to process the waiting page rather than the details page. Is there any way to make it wait until the page is fully loaded?
If some of the content is created dynamically after the page has loaded, your best option for parsing the full content is to use Selenium together with Jsoup:
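// Needs selenium-java and jsoup on the classpath (org.openqa.selenium.WebDriver, org.openqa.selenium.firefox.FirefoxDriver, org.jsoup.Jsoup, org.jsoup.nodes.Document).
WebDriver driver = new FirefoxDriver();
driver.get("http://stackoverflow.com/");  // returns once the browser reports the page as loaded
// If content arrives later via AJAX, add an explicit wait (e.g. WebDriverWait) here before reading the source.
Document doc = Jsoup.parse(driver.getPageSource());  // parse the rendered DOM, not the raw HTTP response
driver.quit();  // close the browser when done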
The page in question is probably generated by JavaScript in the browser (client-side). Jsoup does not interpret JavaScript, so it cannot see that content on its own. However, you can watch the page load in the network tab of the browser developer tools and find out which AJAX calls are made during the load. These calls have URLs of their own, and you may be able to get all the information you need by requesting them directly. Alternatively, you can use a real browser engine to load the page, either through a library such as Selenium WebDriver or through the JavaFX WebView component if you are using Java 8.
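To illustrate the "call the AJAX URL directly" idea, here is a minimal sketch using the `java.net.http.HttpClient` that ships with Java 11 and later. It is not taken from the original answer. The class name `AjaxFetch`, the helper names, and the `X-Requested-With` header are assumptions for illustration. The real URL and any required headers have to be copied from the request you see in the network tab.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class AjaxFetch {

    // Builds a GET request that imitates the page's own AJAX call.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(15))
                .header("Accept", "application/json")
                // Assumption: some sites expect this header on AJAX calls; drop it if yours does not.
                .header("X-Requested-With", "XMLHttpRequest")
                .GET()
                .build();
    }

    // Sends the request and returns the raw body (often JSON or an HTML fragment).
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No URL given: print usage instead of guessing an endpoint.
            System.out.println("usage: java AjaxFetch <ajax-url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

Run it with the endpoint you found, for example `java AjaxFetch.java "https://example.com/api/details?id=42"` (a hypothetical URL). If the response is an HTML fragment, you can hand the returned string to `Jsoup.parse(...)` as usual. If it is JSON, a JSON library is the better fit.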