How to take a webpage screenshot?

I am using the code below, but the generated image is broken. I suspect the rendering options are the cause. Does anybody know what is happening?

package webpageprinter;

import java.net.URL;
import java.awt.image.BufferedImage;
import javax.imageio.ImageIO;
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeEvent;
import javax.swing.text.html.*;
import java.awt.*;
import javax.swing.*;
import java.io.*;

public class WebPagePrinter {
    private BufferedImage image = null;

    public BufferedImage Download(String webpageurl) {
        try {
            URL url = new URL(webpageurl);
            final JEditorPane jep = new JEditorPane();
            jep.setContentType("text/html");
            ((HTMLDocument) jep.getDocument()).setBase(url);
            jep.setEditable(false);
            jep.setBounds(0, 0, 1024, 768);
            jep.addPropertyChangeListener("page", new PropertyChangeListener() {
                @Override
                public void propertyChange(PropertyChangeEvent e) {
                    try {
                        image = new BufferedImage(1024, 768, BufferedImage.TYPE_INT_RGB);
                        Graphics g = image.getGraphics();
                        Graphics2D graphics = (Graphics2D) g;
                        graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
                        jep.paint(graphics);
                        ImageIO.write(image, "png", new File("C:/webpage.png"));
                    } catch (Exception re) {
                        re.printStackTrace();
                    }
                }
            });
            jep.setPage(url);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return image;
    }

    public static void main(String[] args) {

        new WebPagePrinter().Download("http://www.google.com");

    }
}
Felipe Dias asked Aug 11 '11 20:08



3 Answers

I think there are 3 problems and one fragility in that code:

Problems

  1. JEditorPane was never intended to be a browser.
  2. setPage(URL) loads asynchronously. It is necessary to add a listener to determine when the page has loaded.
  3. You might find some sites automatically refuse connections to Java clients.
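On point 3, one frequent cause is the default User-Agent header that Java sends (it looks like "Java/<version>"), which some servers reject. Below is a minimal sketch, assuming the site filters on that header. The class name, the helper name and the User-Agent string are made up for illustration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

class BrowserLikeConnection {

    // Hypothetical User-Agent value; any browser-like string may do.
    static final String USER_AGENT = "Mozilla/5.0 (compatible; WebPagePrinter/1.0)";

    // Prepares (but does not yet open) a connection that identifies itself
    // with USER_AGENT instead of the default "Java/<version>" header.
    static URLConnection prepare(String address) {
        try {
            URLConnection connection = URI.create(address).toURL().openConnection();
            connection.setRequestProperty("User-Agent", USER_AGENT);
            return connection;
        } catch (IOException e) {
            // Covers malformed URLs and unsupported protocols.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection connection = prepare("http://www.google.com");
        // No network traffic happens until connect() or getInputStream() is called.
        System.out.println(connection.getRequestProperty("User-Agent"));
    }
}
```

To use something like this with JEditorPane, one option might be to subclass it and override its protected getStream(URL) method so that the page is read from the prepared connection. Whether a given site then accepts the request is not guaranteed.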

Fragility

The fragility comes from the call to setBounds(). Use layouts instead.

[Image: Google screen shot, rendered at 400x600]

But looking at this image, it seems point 3 does not apply here and point 2 is not the problem. It comes down to point 1: JEditorPane was never intended as a browsing component. Those random characters at the bottom are JavaScript that the JEditorPane is not only failing to execute, but also improperly displaying in the page.

Andrew Thompson answered Nov 13 '22 09:11



You can capture the entire screen using the Java Robot class (java.awt.Robot).

import java.awt.AWTException;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

import javax.imageio.ImageIO;

public class RobotExp {

    public static void main(String[] args) {

        try {

            Robot robot = new Robot();
            // Capture the screen shot of the area of the screen defined by the rectangle
            BufferedImage bi=robot.createScreenCapture(new Rectangle(Toolkit.getDefaultToolkit().getScreenSize()));
            ImageIO.write(bi, "jpg", new File("C:/imageTest.jpg"));

        } catch (AWTException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This example was adapted from one found elsewhere, with some modifications by me.
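If only part of the screen is of interest (say, the area where the browser window sits), createScreenCapture accepts any rectangle, not just the full screen. Here is a sketch, assuming you already know the window's bounds. The class name, the helper name, the coordinates and the output file name are all hypothetical:

```java
import java.awt.AWTException;
import java.awt.GraphicsEnvironment;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

class RegionCapture {

    // Clips the requested region to the screen so the capture never asks
    // for pixels outside the display.
    static Rectangle clipToScreen(Rectangle requested, Rectangle screen) {
        return requested.intersection(screen);
    }

    public static void main(String[] args) throws AWTException, IOException {
        if (GraphicsEnvironment.isHeadless()) {
            System.err.println("No display available, nothing to capture.");
            return;
        }
        Rectangle screen = new Rectangle(Toolkit.getDefaultToolkit().getScreenSize());
        // Hypothetical position and size of the browser window.
        Rectangle region = clipToScreen(new Rectangle(100, 100, 1024, 768), screen);
        if (region.isEmpty()) {
            // createScreenCapture rejects rectangles with no area.
            System.err.println("Requested region lies outside the screen.");
            return;
        }
        BufferedImage capture = new Robot().createScreenCapture(region);
        // PNG is lossless, which tends to suit text-heavy screenshots.
        ImageIO.write(capture, "png", new File("region.png"));
    }
}
```

Note that this still captures whatever is visible on screen at that moment, so the browser has to be open and unobscured.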

hbhakhra answered Nov 13 '22 07:11



Your problem is that you're using Java's JEditorPane to render the webpage, and it has a very limited HTML rendering engine. It simply cannot display complex webpages as well as a modern browser can.

If you need to produce screenshots of correctly rendered complex webpages using Java, the best way is probably to use Selenium to control a real browser like Firefox.

Michael Borgwardt answered Nov 13 '22 08:11
