I need to parse a page. Everything works except that some elements on the page are loaded dynamically. I used jsoup for the static elements; when I realized I really needed the dynamic ones, I tried JavaFX. I read a lot of answers on Stack Overflow, and many of them recommended the JavaFX WebEngine, so I ended up with this code:
    @Override
    public void start(Stage primaryStage) {
        WebView webview = new WebView();
        final WebEngine webengine = webview.getEngine();
        webengine.getLoadWorker().stateProperty().addListener(
            new ChangeListener<State>() {
                @Override
                public void changed(ObservableValue<? extends State> ov, State oldState, State newState) {
                    if (newState == Worker.State.SUCCEEDED) {
                        Document doc = webengine.getDocument();
                        // Serialize the DOM
                        OutputFormat format = new OutputFormat(doc);
                        // as a String
                        StringWriter stringOut = new StringWriter();
                        XMLSerializer serial = new XMLSerializer(stringOut, format);
                        try {
                            serial.serialize(doc);
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                        // Display the XML
                        System.out.println(stringOut.toString());
                    }
                }
            });
        webengine.load("http://detail.tmall.com/item.htm?spm=a220o.1000855.0.0.PZSbaQ&id=19378327658");
        primaryStage.setScene(new Scene(webview, 800, 800));
        primaryStage.show();
    }
I serialized the org.w3c.dom.Document to a string and printed it, but that did not help either. primaryStage.show() displayed the fully loaded page (with the element I need rendered on it), but that element was missing from the HTML in the output.
This is the third day I've been working on this issue. Lack of experience is of course my main problem, but I have to admit I'm stuck. This is my first Java project after reading Java: The Complete Reference. I'm doing it to get some real-world experience (and for fun). I want to build a parser for a Chinese "eBay".
Here is the problem and my test cases:
http://detail.tmall.com/item.htm?spm=a220o.1000855.0.0.PZSbaQ&id=19378327658 need to get dynamically loaded discount "129.00"
http://item.taobao.com/item.htm?spm=a230r.1.14.67.MNq30d&id=22794120348 need "15.20"
As you can see, if you open these pages in a browser, you first see the original price and, after a second or so, the discount.
Is it even possible to get these dynamic discounts from the HTML page? The other elements I need to parse are static. What should I try next: another library that renders HTML with JavaScript, or something else? I would really appreciate some advice; I don't want to give up.
The DOM model returned after Worker.State.SUCCEEDED should already have been processed by JavaScript.
Your code worked for me when tested with FX 7u40 and 8.0 dev. I see the following output in the log:
<DIV id="J_PromoBox"><EM class="tb-promo-price-type">夏季新品</EM><EM class="tm-yen">¥</EM>
<STRONG class="J_CurPrice">129.00</STRONG></DIV>
which is the dynamically loaded box with the data (129.00) you were looking for.
You may want to upgrade your JDK to 7u40 or revisit your log parsing algorithm.
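If the price is present in the DOM but gets lost when the printed output is searched, one option is to skip the string round trip and query the org.w3c.dom.Document directly. The following is only a sketch under stated assumptions: the class name PriceExtractor and both method names are hypothetical, the sample markup is a trimmed copy of the log fragment above, and it assumes the standard javax.xml.xpath API can evaluate against the Document returned by webengine.getDocument(), which was not verified here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JavaFX or jsoup.
final class PriceExtractor {

    // Returns the trimmed text of the first element whose class attribute
    // equals cssClass exactly, or null if there is no such element.
    static String firstTextByClass(Document doc, String cssClass) {
        try {
            // The class name is inlined into the expression, so it must not contain quotes.
            String expr = "//*[@class='" + cssClass + "']";
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODE);
            return node == null ? null : node.getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed", e);
        }
    }

    // Demo helper only: builds a Document from a string so the lookup can be
    // tried without starting JavaFX. In the listener, the Document would
    // come from webengine.getDocument() instead.
    static Document parseXml(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed version of the fragment shown in the log above.
        String sample = "<DIV id=\"J_PromoBox\">"
                + "<STRONG class=\"J_CurPrice\">129.00</STRONG></DIV>";
        System.out.println(firstTextByClass(parseXml(sample), "J_CurPrice"));
    }
}
```

Inside the SUCCEEDED branch of the listener, the call would look like PriceExtractor.firstTextByClass(webengine.getDocument(), "J_CurPrice"). A null result would suggest the element had not been added to the DOM at that point, rather than a problem with the serialization step.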