
JavaFX WebEngine wait for ajax to complete

I'm developing a data mining application in JavaFX which relies on the WebView (and thus also the WebEngine). The mining happens in 2 steps: first the user uses the UI to navigate to a website in the WebView to configure where interesting data can be searched. Second, using a background task that periodically runs, a WebEngine loads the same document and tries to extract the data from the loaded document.

This works perfectly for most cases, but recently I've run into trouble with pages that use AJAX to render content. To check whether the WebEngine has loaded the document, I listen to the stateProperty of its load worker. When the state transitions to SUCCEEDED, I know the document is loaded (together with any JavaScript that might run on document.ready() or equivalent). This is because JavaScript is executed on the JavaFX thread, if I'm not mistaken (source: https://blogs.oracle.com/javafx/entry/communicating_between_javascript_and_javafx). However, if an AJAX call is started, the JavaScript execution finishes and the engine reports the document as ready even though it obviously is not, since the contents might still change once the outstanding AJAX call returns.

Is there any way around this, to inject a hook so I am notified when AJAX calls are finished? I've tried installing a default complete handler in $.ajaxSetup() but that is quite dodgy because if an ajax call overrides the complete handler, the default won't be called. Plus, I can only inject this after the document is first loaded (and by then some AJAX calls may already be running). I've tested this injection with an upcall and it works fine for AJAX calls that are launched on command (after the injection of the default handler) that don't supply their own complete handler.

I'm looking for two things: firstly: a generic way to hook into the completion handler of AJAX calls, and secondly: a way to wait for the WebEngine to finish all AJAX calls and notify me afterwards.

Asked by RDM, Jan 25 '14 17:01


1 Answer

Explanation

I've also had this problem and solved it by providing my own implementation of sun.net.www.protocol.http.HttpURLConnection, which I use to process any AJAX requests. My class, conveniently called AjaxHttpURLConnection, overrides the getInputStream() method but does not return the original input stream. Instead, I hand an instance of PipedInputStream back to the WebEngine. I then read all the data coming from the original input stream and pass it on to my piped stream. This way, I gain two benefits:

  1. I know when the last byte has been received and thus when the AJAX request has been processed completely.
  2. I can even grab all the incoming data and start working with it right away (if I wanted to).
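
The relay idea can be tried out without a WebEngine. The following is only a minimal sketch using plain java.io classes: PipedRelayDemo, relay() and runDemo() are hypothetical names that do not appear in my actual code, and a ByteArrayInputStream stands in for the real HTTP response. The callback runs once the source is exhausted and the pipe has been closed, which is the point where an AJAX request would count as finished.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class PipedRelayDemo {

    // Hypothetical helper: hands the consumer a piped stream and runs
    // onComplete once the source is exhausted and the pipe has been closed.
    static InputStream relay(final InputStream source, final Runnable onComplete) {
        final PipedOutputStream pipedOut = new PipedOutputStream();
        final PipedInputStream pipedIn;
        try {
            pipedIn = new PipedInputStream(pipedOut);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread pump = new Thread(new Runnable() {
            public void run() {
                // Both streams are closed when this try block exits.
                try (InputStream in = source; OutputStream out = pipedOut) {
                    byte[] buffer = new byte[4096];
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n); // blocks while the pipe is full
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                } finally {
                    onComplete.run(); // runs after the pipe was closed
                }
            }
        });
        pump.setDaemon(true);
        pump.start();
        return pipedIn;
    }

    // Reads a fake response through the relay and returns what the consumer saw.
    static String runDemo() {
        final CountDownLatch finished = new CountDownLatch(1);
        byte[] body = "{\"status\":\"ok\"}".getBytes(StandardCharsets.UTF_8);
        InputStream in = relay(new ByteArrayInputStream(body), new Runnable() {
            public void run() {
                finished.countDown();
            }
        });
        try {
            ByteArrayOutputStream seen = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            int n;
            while ((n = in.read(buffer)) != -1) {
                seen.write(buffer, 0, n);
            }
            if (!finished.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("completion callback never ran");
            }
            return new String(seen.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Note that the copy has to run on its own thread: a PipedOutputStream blocks once the pipe's small internal buffer is full, so writer and reader must not share a thread.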


Example

First, you will have to tell Java to use your URLConnection implementation instead of the default one. To do so, you must provide it with your own version of the URLStreamHandlerFactory. You can find many threads here on SO (e.g. this one) or via Google on this topic. In order to set your factory instance, put the following somewhere early in your main method. This is what mine looks like.

import java.net.URL;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

import javafx.application.Application;

public class MyApplication extends Application {

    // ...

    public static void main(String[] args) {
        URL.setURLStreamHandlerFactory(new URLStreamHandlerFactory() {
            public URLStreamHandler createURLStreamHandler(String protocol) {
                if ("http".equals(protocol)) {
                    return new MyUrlConnectionHandler();    
                }
                return null; // Let the default handlers deal with whatever comes here (e.g. https, jar, ...)
            }
        });
        launch(args);
    }
}

Second, we have to come up with our own Handler that tells the programme when to use which type of URLConnection.

import java.io.IOException;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

import sun.net.www.protocol.http.Handler;
import sun.net.www.protocol.http.HttpURLConnection;

public class MyUrlConnectionHandler extends Handler {

    @Override
    protected URLConnection openConnection(URL url, Proxy proxy) throws IOException {

        if (url.toString().contains("ajax=1")) {
            return new AjaxHttpURLConnection(url, proxy, this);
        }

        // Return a default HttpURLConnection instance.
        return new HttpURLConnection(url, proxy);
    }
}
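
One caveat about the check above: url.toString().contains("ajax=1") also matches URLs such as ...?ajax=10 or a path that happens to contain that text. If that matters, a stricter test on the query string could look like the sketch below. AjaxUrlMatcher and hasAjaxFlag() are hypothetical names, and the example.com URLs are made up for illustration; inside the handler one would presumably call hasAjaxFlag(url.getQuery()).

```java
import java.net.URI;

class AjaxUrlMatcher {

    // Returns true only if the query contains the exact parameter "ajax=1".
    static boolean hasAjaxFlag(String query) {
        if (query == null) {
            return false; // URL without a query string
        }
        for (String pair : query.split("&")) {
            if (pair.equals("ajax=1")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up URLs, purely for illustration.
        String[] urls = {
            "http://example.com/data?page=2&ajax=1",
            "http://example.com/data?ajax=10",
            "http://example.com/ajax=1/index.html"
        };
        for (String u : urls) {
            System.out.println(u + " -> " + hasAjaxFlag(URI.create(u).getQuery()));
        }
    }
}
```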

Last but not least, here comes the AjaxHttpURLConnection.

import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.net.Proxy;
import java.net.URL;
import java.util.concurrent.locks.ReentrantLock;

import org.apache.commons.io.IOUtils;

import sun.net.www.protocol.http.Handler;
import sun.net.www.protocol.http.HttpURLConnection;

public class AjaxHttpURLConnection extends HttpURLConnection {

    private PipedInputStream pipedIn;
    private ReentrantLock lock;

    protected AjaxHttpURLConnection(URL url, Proxy proxy, Handler handler) {
        super(url, proxy, handler);
        this.pipedIn = null;
        this.lock = new ReentrantLock(true);
    }

    @Override
    public InputStream getInputStream() throws IOException {

        lock.lock();
        try {

            // Do we have to set up our own input stream?
            if (pipedIn == null) {

                PipedOutputStream pipedOut = new PipedOutputStream();
                pipedIn = new PipedInputStream(pipedOut);

                /*
                 * Careful here! For some reason, getInputStream() seems to
                 * call itself (no idea why). pipedIn must therefore already
                 * be set (as it is above) before super.getInputStream() is
                 * called; otherwise we run into a loop or into EOFExceptions!
                 */
                final InputStream in = super.getInputStream();

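                // TODO: timeout?
                new Thread(new Runnable() {
                    public void run() {
                        try {

                            // Pass the original data on to the browser.
                            byte[] data = IOUtils.toByteArray(in);
                            pipedOut.write(data);
                            pipedOut.flush();
                            pipedOut.close();

                            // Do something with the data? Decompress it if it was
                            // gzipped, for example.

                            // Signal that the browser has finished.

                        } catch (IOException e) {
                            e.printStackTrace();
                        } finally {
                            // Close both streams even if the copy failed, so the
                            // reader of pipedIn is not left waiting on a dead pipe.
                            try { pipedOut.close(); } catch (IOException ignored) { }
                            try { in.close(); } catch (IOException ignored) { }
                        }
                    }
                }).start();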
            }
        } finally {
            lock.unlock();
        }
        return pipedIn;
    }
}
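
The "Signal that the browser has finished" step is left open above. One possible way to fill it in, sketched here under assumptions, is a shared counter of in-flight requests that fires a callback when it drops back to zero. PendingRequestTracker and its methods are hypothetical names, not part of my code: requestStarted() would be called when the pipe is set up in getInputStream(), and requestFinished() where the signal comment sits.

```java
import java.util.concurrent.atomic.AtomicInteger;

class PendingRequestTracker {

    private final AtomicInteger inFlight = new AtomicInteger();
    private final Runnable onIdle;

    // onIdle runs on whichever thread finishes the last pending request.
    // In a JavaFX application it would probably wrap its work in
    // Platform.runLater(...) before touching the WebEngine.
    PendingRequestTracker(Runnable onIdle) {
        this.onIdle = onIdle;
    }

    void requestStarted() {
        inFlight.incrementAndGet();
    }

    void requestFinished() {
        if (inFlight.decrementAndGet() == 0) {
            onIdle.run();
        }
    }

    int pending() {
        return inFlight.get();
    }

    // Simulates two overlapping requests and returns how often onIdle fired.
    static int runDemo() {
        final AtomicInteger idleNotifications = new AtomicInteger();
        PendingRequestTracker tracker = new PendingRequestTracker(new Runnable() {
            public void run() {
                idleNotifications.incrementAndGet();
            }
        });
        tracker.requestStarted();
        tracker.requestStarted();
        tracker.requestFinished(); // one request is still pending, no callback
        tracker.requestFinished(); // last request done, onIdle runs
        return idleNotifications.get();
    }

    public static void main(String[] args) {
        System.out.println("idle notifications: " + runDemo());
    }
}
```

A page that chains its requests one after another will drop to zero more than once, so the callback may fire several times. Waiting for a short quiet period after the last notification before extracting data is probably the safer choice.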


Further Considerations

  • If you are using multiple WebEngine objects, it might be tricky to tell which one actually opened the URLConnection and thus which browser has finished loading.
  • You might have noticed that I only dealt with http connections. I have not tested to what extent my approach could be transferred to https etc. (not an expert here :O).
  • As you have seen, my only means to know when to actually use my AjaxHttpURLConnection is when the corresponding url contains ajax=1. In my case, this was sufficient. Since I am not too good with html and http, however, I don't know if the WebEngine can make AJAX requests in any different way (e.g. the header fields?). If in doubt, you could simply always return an instance of our modified url connection, but that would of course mean some overhead.
  • As stated in the beginning, you can immediately work with the data once it has been retrieved from the input stream if you wish to do so. You can grab the request data that your WebEngine sends in a similar way. Just wrap the getOutputStream() function and place another intermediate stream to grab whatever is being sent and then pass it on to the original output stream.
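
For the last point, the intermediate stream could be a small FilterOutputStream that keeps a copy of everything written through it. This is only a sketch: CapturingOutputStream is a hypothetical name, and in the connection class it would presumably wrap the stream returned by super.getOutputStream(). In the demo a ByteArrayOutputStream stands in for the real connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CapturingOutputStream extends FilterOutputStream {

    // Copy of every byte that was passed on to the wrapped stream.
    private final ByteArrayOutputStream copy = new ByteArrayOutputStream();

    CapturingOutputStream(OutputStream target) {
        super(target);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        copy.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        copy.write(b, off, len);
    }

    byte[] captured() {
        return copy.toByteArray();
    }

    // Sends a made-up request body through the wrapper and returns the
    // captured copy as text.
    static String runDemo() {
        ByteArrayOutputStream target = new ByteArrayOutputStream(); // stand-in for the connection
        CapturingOutputStream capturing = new CapturingOutputStream(target);
        try {
            capturing.write("q=javafx&ajax=1".getBytes(StandardCharsets.UTF_8));
            capturing.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(capturing.captured(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```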
Answered by MightyMalcolm, Sep 28 '22 03:09