
Java GUI to display webpages and return HTML

I need a workflow like below:

// load a page in the browser window
// the browser is live, meaning users can interact with it
browser.load("http://www.google.com");

// return the HTML of the initially loaded page
String page = browser.getHTML();

// after some time
// user might have navigated to a new page, get HTML again
String newpage = browser.getHTML();

I am surprised to see how hard this is to do with Java GUIs such as JavaFX (http://lexandera.com/2009/01/extracting-html-from-a-webview/) and Swing.

Is there some simple way to get this functionality in Java?

Moeb asked Nov 18 '13 15:11


2 Answers

Here is a contrived example using JavaFX that prints the HTML content to System.out. It should not be too complicated to adapt it into a getHtml() method. (I tested it with JavaFX 8, but it should work with JavaFX 2 too.)

The code prints the HTML content every time a new page finishes loading.

Note: I have borrowed the printDocument code from this answer.

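import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import javafx.application.Application;
import javafx.beans.value.ChangeListener;
import javafx.beans.value.ObservableValue;
import javafx.concurrent.Worker;
import javafx.concurrent.Worker.State;
import javafx.scene.Scene;
import javafx.scene.web.WebEngine;
import javafx.scene.web.WebView;
import javafx.stage.Stage;

import org.w3c.dom.Document;

public class TestFX extends Application {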

    @Override
    public void start(Stage stage) throws Exception {
        try {
            final WebView webView = new WebView();
            final WebEngine webEngine = webView.getEngine();

            Scene scene = new Scene(webView);

            stage.setScene(scene);
            stage.setWidth(1200);
            stage.setHeight(600);
            stage.show();

            webEngine.getLoadWorker().stateProperty().addListener(new ChangeListener<Worker.State>() {
                @Override
                public void changed(ObservableValue<? extends State> ov, State t, State t1) {
                    if (t1 == Worker.State.SUCCEEDED) {
                        try {
                            printDocument(webEngine.getDocument(), System.out);
                        } catch (Exception e) { e.printStackTrace(); }
                    }
                }
            });

            webView.getEngine().load("http://www.google.com");

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void printDocument(Document doc, OutputStream out) throws IOException, TransformerException {
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer = tf.newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
        transformer.transform(new DOMSource(doc), new StreamResult(new OutputStreamWriter(out, "UTF-8")));
    }

    public static void main(String[] args) {
        launch(args);
    }
}
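
As a rough sketch of the getHtml() adaptation mentioned above, the same Transformer approach can write into a StringWriter instead of an OutputStream. The class name HtmlSerializer and the method name getHtml are made up for illustration and are not part of JavaFX. The snippet only assumes it is handed an org.w3c.dom.Document, which is what webEngine.getDocument() returns.

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper (name chosen for this sketch): turns a DOM document into a String.
public class HtmlSerializer {

    // Serializes the document with the same kind of Transformer used in printDocument,
    // but collects the result in memory instead of printing it.
    public static String getHtml(Document doc) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    // Small stand-alone demo that builds a tiny document by hand, so no browser is needed.
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        body.setTextContent("hello");
        html.appendChild(body);
        doc.appendChild(html);

        System.out.println(getHtml(doc));
    }
}
```

Inside the SUCCEEDED branch of the listener, the call would presumably look like String page = HtmlSerializer.getHtml(webEngine.getDocument()); and it should run on the JavaFX application thread, like the rest of the listener.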
assylias answered Oct 20 '22 03:10



Below you will find a SimpleBrowser component which is a Pane containing a WebView.

Source code at gist.

Sample usage:

SimpleBrowser browser = new SimpleBrowser()
          .useFirebug(true);    

// ^ useFirebug(true) option - will enable Firebug Lite which can be helpful for 
// | debugging - i.e. to inspect a DOM tree or to view console messages 

Scene scene = new Scene(browser);

browser.load("http://stackoverflow.com", new Runnable() {
    @Override
    public void run() {
        System.out.println(browser.getHTML());
    }
});

browser.getHTML() is called inside a Runnable because the page has to finish downloading and rendering first. Calling the method before the page has loaded returns an empty page, so passing a Runnable that runs once loading succeeds is a simple way to wait for it.

import javafx.beans.value.ChangeListener;
import javafx.beans.value.ObservableValue;
import javafx.concurrent.Worker;
import javafx.scene.layout.*;
import javafx.scene.web.WebEngine;
import javafx.scene.web.WebView;

public class SimpleBrowser extends Pane {
    protected final WebView webView = new WebView();
    protected final WebEngine webEngine = webView.getEngine();

    protected boolean useFirebug;

    // Callback from the most recent load(location, onLoad) call; may be null.
    protected Runnable onLoad;

    public WebView getWebView() {
        return webView;
    }

    public WebEngine getEngine() {
        return webView.getEngine();
    }

    public SimpleBrowser load(String location) {
        return load(location, null);
    }

    public SimpleBrowser load(String location, final Runnable onLoad) {
        // Replace the previous callback rather than adding a new listener on every
        // call, which would make earlier callbacks fire again on each page load.
        this.onLoad = onLoad;
        webEngine.load(location);
        return this;
    }

    public String getHTML() {
        return (String)webEngine.executeScript("document.getElementsByTagName('html')[0].innerHTML");
    }

    public SimpleBrowser useFirebug(boolean useFirebug) {
        this.useFirebug = useFirebug;
        return this;
    }

    public SimpleBrowser() {
        this(false);
    }

    public SimpleBrowser(boolean useFirebug) {
        this.useFirebug = useFirebug;

        getChildren().add(webView);

        webView.prefWidthProperty().bind(widthProperty());
        webView.prefHeightProperty().bind(heightProperty());

        // Register a single load listener for the lifetime of the component.
        webEngine.getLoadWorker().stateProperty().addListener(new ChangeListener<Worker.State>() {
            @Override
            public void changed(ObservableValue<? extends Worker.State> ov, Worker.State t, Worker.State t1) {
                if (t1 == Worker.State.SUCCEEDED) {
                    // Read the field, not the constructor parameter, so useFirebug(...) takes effect.
                    if (SimpleBrowser.this.useFirebug) {
                        webEngine.executeScript("if (!document.getElementById('FirebugLite')){E = document['createElement' + 'NS'] && document.documentElement.namespaceURI;E = E ? document['createElement' + 'NS'](E, 'script') : document['createElement']('script');E['setAttribute']('id', 'FirebugLite');E['setAttribute']('src', 'https://getfirebug.com/' + 'firebug-lite.js' + '#startOpened');E['setAttribute']('FirebugLite', '4');(document['getElementsByTagName']('head')[0] || document['getElementsByTagName']('body')[0]).appendChild(E);E = new Image;E['setAttribute']('src', 'https://getfirebug.com/' + '#startOpened');}");
                    }
                    if (onLoad != null) {
                        onLoad.run();
                    }
                }
            }
        });
    }
}

Demo Browser:

import javafx.application.Application;
import javafx.event.ActionEvent;
import javafx.event.EventHandler;
import javafx.scene.Scene;
import javafx.scene.control.Button;
import javafx.scene.control.TextField;
import javafx.scene.layout.HBox;
import javafx.scene.layout.Priority;
import javafx.scene.layout.VBox;
import javafx.stage.Stage;

public class FXBrowser {
    public static class TestOnClick extends Application {

        @Override
        public void start(Stage stage) throws Exception {
            try {
                // final so the anonymous handlers below can capture it on Java 7 as well
                final SimpleBrowser browser = new SimpleBrowser()
                    .useFirebug(true);

                final TextField location = new TextField("http://stackoverflow.com");

                Button go = new Button("Go");

                go.setOnAction(new EventHandler<ActionEvent>() {
                    @Override
                    public void handle(ActionEvent arg0) {
                        browser.load(location.getText(), new Runnable() {
                            @Override
                            public void run() {
                                System.out.println("---------------");
                                System.out.println(browser.getHTML());
                            }
                        });
                    }
                });

                HBox toolbar  = new HBox();
                toolbar.getChildren().addAll(location, go);

                toolbar.setFillHeight(true);

                // Plain VBox instead of the deprecated VBoxBuilder
                VBox vBox = new VBox();
                vBox.getChildren().addAll(toolbar, browser);
                vBox.setFillWidth(true);


                Scene scene = new Scene( vBox);

                stage.setScene(scene);
                stage.setWidth(1024);
                stage.setHeight(768);
                stage.show();

                VBox.setVgrow(browser, Priority.ALWAYS);

                browser.load("http://stackoverflow.com");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            launch(args);
        }
    }
}
Andrey Chaschev answered Oct 20 '22 03:10
