I am trying to access content on a web page that is generated by JavaScript. The content is created after the page has loaded, so that chunk of HTML is nowhere to be found when I try to parse the page with Jsoup.
My code for getting the HTML source using HtmlUnit is as follows:
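import static java.lang.System.out;

import java.io.IOException;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Main {
    public static void main(String[] args) throws IOException {
        // Silence HtmlUnit's verbose logging.
        java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(java.util.logging.Level.OFF);
        WebClient webClient = new WebClient(BrowserVersion.CHROME);
        // Keep going even if a script fails or the server returns an error status.
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        String url = "myUrl.com"; // placeholder; a real URL needs a scheme such as http://
        out.println("accessing " + url);
        HtmlPage page = webClient.getPage(url);
        out.println("waiting for js");
        // Give background JavaScript time to run before reading the DOM.
        webClient.waitForBackgroundJavaScriptStartingBefore(200);
        webClient.waitForBackgroundJavaScript(20000);
        out.println(page.asXml());
        webClient.close();
    }
}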
But when I run it, the HTML that is supposed to be created by the JavaScript is not printed. How do I get this JavaScript-generated HTML using HtmlUnit, and then pass the result to Jsoup for parsing?
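Jsoup is a server-side HTML parsing library. It does not execute JavaScript, so it only sees the markup it is given.
I am not sure what your final goal is. I assume you want to work with the page as it is rendered in the browser, so I would go with Ajax: grab the rendered HTML on the client and post it to the server, where Jsoup can parse it.
Something like: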
$( document ).ready(function() {
    // Grab the HTML as it exists after client-side scripts have run.
    var allClientSideHtml = $("html").html();
    var dataToSend = JSON.stringify({'htmlSendToSever': allClientSideHtml});
    $.ajax({
        url: "your_Jsoup_server_url.jsp_or_php/YourJsoupParser",
        type: "POST",
        contentType: "application/json; charset=utf-8",
        dataType: "json",
        data: dataToSend, // pass that text to the server as a JSON string
        success: function (msg) { alert(msg.d); },
        error: function (type) { alert("ERROR!!" + type.responseText); }
    });
});
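If you would rather stay in Java with HtmlUnit, as in the question, one option that may help is to poll until the generated content actually shows up instead of waiting a fixed amount of time. Below is a minimal sketch of such a polling helper using only the standard library. The class name WaitUtil and the method waitFor are my own hypothetical names, not part of HtmlUnit or Jsoup.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of HtmlUnit or Jsoup): polls a condition
// until it holds or a timeout elapses.
final class WaitUtil {

    private WaitUtil() {
    }

    /**
     * Checks the condition immediately, then every pollMillis milliseconds.
     * Returns true as soon as the condition holds, or false if the timeout
     * elapses or the thread is interrupted first.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Subtract before comparing so nanoTime wrap-around is handled.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With the code from the question, the condition could be something like () -> page.getElementById("content") != null, where "content" is only an assumed id for the generated element. Once it returns true, the string from page.asXml() can be handed to Jsoup.parse(html, url) and queried as usual. I have not tested this against your page, so treat it as a starting point.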