Using the 'webpage' Phantom module in node.js

I am trying to wrap a PhantomJS script in a node.js process. The phantom script takes a URL from the command-line arguments and outputs a PDF (much like the rasterize.js example included with the PhantomJS install).

The phantom script I have works fine; it's just that my employer wants a node script if possible. No problem, I can use the node-phantom node module to wrap it.

But now I've hit a stumbling block. My phantom script has:

var page = require('webpage').create();

So node.js tries to find a module called 'webpage'. The 'webpage' module is built into the PhantomJS install, so node can't find it. As far as I can tell, there is no npm module called 'webpage'.

'webpage' is used like this:

page.open(address, function (status) {

    if (status !== 'success') {

        // --- Error opening the webpage ---
        console.log('Unable to load the address!');

    } else {

        // --- Wait 200 ms Before Rendering ---
        window.setTimeout(function () {
            page.render(output);
            phantom.exit();
        }, 200);
    }
});

where address is the URL specified on the command line and output is another argument: the name and type of the output file.

Can anyone help me out? This is quite an abstract one so I'm not expecting much if I'm honest, worth a try though.

Thanks.

EDIT - Approx 2hrs later

I now have this which throws out a PDF:

var phanty = require('node-phantom');

var system = require('system');

phanty.create(function(err,phantom) {

    //var page = require('webpage').create();

    var address;
    var output;
    var size;

    if (system.args.length < 4 || system.args.length > 6) {

        // --- Bad Input ---

        console.log('Wrong usage, you need to specify the BLAH BLAH BLAH');
        phantom.exit(1);

    } else {

        phantom.createPage(function(err,page){

            // --- Set Variables, Web Address, Output ---
            address = system.args[2];
            output = system.args[3];
            page.viewportSize = { width: 600, height: 600 };


            // --- Set Variables, Web Address ---
            if (system.args.length > 4 && system.args[3].substr(-4) === ".pdf") {

                // --- PDF Specific ---
                size = system.args[4].split('*');
                page.paperSize = size.length === 2 ? { width: size[0], height: size[1], margin: '0px' }
                                                   : { format: system.args[4], orientation: 'portrait', margin: '1cm' };
            }

            // --- Zoom Factor (Should Never Be Set) ---
            if (system.args.length > 5) {
                page.zoomFactor = system.args[5];
            } else {
                page.zoomFactor = 1;
            }

            //----------------------------------------------------

            page.open(address ,function(err,status){

                if (status !== 'success') {

                    // --- Error opening the webpage ---
                    console.log('Unable to load the address!');

                } else {

                    // --- Wait 200 ms Before Rendering ---
                    // (process.nextTick takes no delay argument, so use setTimeout)
                    setTimeout(function () {
                        page.render(output);
                        phantom.exit();
                    }, 200);
                }

            });

        });
    }
});

But! It's not the right size! The page object created using the phantom 'webpage' create() function looks like this before it's passed the URL:

[image: phantom returned page]

Whereas mine in my node script, looks like this:

[image: my page]

Is it possible to hard code the properties to achieve A4 formatting? What properties am I missing?

I'm so close!

Adam Waite asked Oct 18 '12 14:10


2 Answers

It should be something like:

var phantom = require('node-phantom');
phantom.create(function (error, ph) {
  ph.createPage(function (err, page) {
    // url is the address to load
    page.open(url, function (err, status) {
      // do something
    });
  });
});

Your confusion here is because you want to reuse the same concepts and metaphors from your PhantomJS script. It does not work that way. I suggest that you spend some time studying the included tests of node-phantom, see https://github.com/alexscheelmeyer/node-phantom/tree/master/test.
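As one illustration of the difference, a Node script reads its command-line arguments from process.argv rather than from PhantomJS's system module. Below is a minimal sketch, assuming the same argument order as rasterize.js (address, output, optional paper size, optional zoom). The name parseRasterizeArgs is a hypothetical helper for illustration and is not part of node-phantom.

```javascript
// Hypothetical helper (not part of node-phantom): read rasterize.js-style
// arguments in a Node script. In Node, process.argv[0] is the node binary
// and process.argv[1] is the script path, so user arguments start at index 2.
function parseRasterizeArgs(argv) {
    var args = argv.slice(2);
    if (args.length < 2 || args.length > 4) {
        throw new Error('Usage: node script.js URL output [paperSize] [zoom]');
    }
    return {
        address: args[0],
        output: args[1],
        paperSize: args[2],                        // undefined when not supplied
        zoom: args.length > 3 ? Number(args[3]) : 1
    };
}

// Example with a hand-built argv array (values are illustrative only).
var opts = parseRasterizeArgs(['node', 'script.js', 'http://example.com', 'out.pdf', 'A4']);
console.log(opts.address, opts.output, opts.paperSize, opts.zoom);
```

In a real script the call would be parseRasterizeArgs(process.argv), with the resulting address passed to page.open() inside the createPage callback.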

Ariya Hidayat answered Oct 10 '22 01:10


Using https://github.com/sgentle/phantomjs-node I have made an A4 page in nodejs using phantom with the following code:

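var phantom = require('phantom'); // phantomjs-node is published on npm as 'phantom'

phantom.create(function (ph) {
    ph.createPage(function (page) {
        // Page properties are set through page.set() rather than direct assignment
        page.set("paperSize", { format: "A4", orientation: 'portrait', margin: '1cm' });
        page.open("http://www.google.com", function (status) {
            page.render("google.pdf", function () {
                console.log("page rendered");
                ph.exit();
            });
        });
    });
});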

Side Note:

The page.set() function accepts any property that you would assign directly on the page object in the rasterize.js example. Compare how paperSize is set above with the corresponding lines in rasterize.js.
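The size-argument handling from rasterize.js can be kept as a plain function and its result handed to page.set(). This is a minimal sketch: paperSizeFromArg is a hypothetical helper name, and the 'width*height' convention is the one used in the question's script.

```javascript
// Hypothetical helper: turn a rasterize.js-style size argument into a
// paperSize object. "21cm*29.7cm" gives explicit dimensions; anything
// else (for example "A4") is treated as a named paper format.
function paperSizeFromArg(arg) {
    var size = arg.split('*');
    return size.length === 2
        ? { width: size[0], height: size[1], margin: '0px' }
        : { format: arg, orientation: 'portrait', margin: '1cm' };
}

console.log(paperSizeFromArg('A4'));
console.log(paperSizeFromArg('21cm*29.7cm'));

// Inside the createPage callback shown above, the result would be applied with:
//     page.set('paperSize', paperSizeFromArg('A4'));
```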

Ryan Knell answered Oct 10 '22 03:10