PhantomJS does not execute the function passed to page.evaluate

I'm scraping a Facebook page with the PhantomJS node module (https://github.com/sgentle/phantomjs-node), but when I call page.evaluate, the function I pass to it does not seem to run. The code works when I put it in a standalone script and run it with the Node interpreter. The same code inside an Express.js app does not work.

This is my code:

facebookScraper.prototype.scrapeFeed = function (url, cb) {
    var f = ':scrapeFeed:';

    var evaluator = function (s) {
        var posts = [];

        for (var i = 0; i < FEED_ITEMS; i++) {
            log.info(__filename+f+' iterating step ' + i);
            log.info(__filename+f+util.inspect(document, false, null));
        }

        return {
            news: posts
        };
    }

    phantom.create(function (ph) {
        ph.createPage(function (page) {
            log.fine(__filename+f+' opening url ' + url);
            page.open(url, function (status) {
                log.fine(__filename+f+' opened site? ' + status);
                setTimeout(function() {
                    page.evaluate(evaluator, function (result) {
                        log.info(__filename+f+'Scraped feed: ' + util.inspect(result, false, null));
                        cb(result, ph);
                    });
                }, 5000);
            });
        });
    });
};

The output I get:

{"level":"fine","message":"PATH/fb_regular.js:scrapeFeed: opening url <URL> ","timestamp":"2012-09-23T18:35:10.151Z"}
{"level":"fine","message":"PATH/fb_regular.js:scrapeFeed: opened site? success","timestamp":"2012-09-23T18:35:12.682Z"}
{"level":"info","message":"PATH/fb_regular.js:scrapeFeed: Scraped feed: null","timestamp":"2012-09-23T18:35:12.687Z"}

As you can see, it calls the phantom callback (the second parameter of evaluate) with a null argument, but it does not appear to execute the first parameter (my evaluator function, which should print "iterating step X").

Does anyone know what the problem is?

philipDS asked Sep 23 '12 18:09


1 Answer

I'm not sure which version of PhantomJS you are using, but according to the documentation for versions 1.6+, anything logged inside the evaluated script goes to the console of the page it runs in. It does not appear in your own console. To see those messages, you have to attach a handler to the page's onConsoleMessage event:

  page.onConsoleMessage = function (msg) { console.log(msg); };

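As for the result not being available: in PhantomJS itself, page.evaluate takes the function to execute as its first argument, and any remaining arguments are passed as input to that function. The result is returned directly (with the phantomjs-node bridge the call is asynchronous, so the result arrives in the callback instead):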

  var title = page.evaluate(function (s) {
      return document.querySelector(s).innerText;
  }, 'title');
  console.log(title);
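
A likely reason for the null result in the question is that the evaluated function runs inside the page, not inside Node, so names such as log, util, __filename, f and FEED_ITEMS do not exist there and the function throws before returning anything. The sketch below only imitates that behaviour with Node's vm module, without PhantomJS. fakeEvaluate is a hypothetical stand-in written for this illustration, and returning null on failure is an assumption chosen to match the output shown in the question.

```javascript
// Rough simulation of a sandboxed evaluate: the function is turned into
// source text and run in a separate context, so variables from the
// enclosing Node scope do not travel with it.
var vm = require('vm');

var FEED_ITEMS = 3; // exists only on the Node side

// Hypothetical stand-in for page.evaluate; not part of any PhantomJS API.
function fakeEvaluate(fn, args) {
    var source = '(' + fn.toString() + ').apply(null, ' +
        JSON.stringify(args || []) + ')';
    try {
        // The empty sandbox plays the role of the page: no FEED_ITEMS, log or util.
        return vm.runInNewContext(source, {});
    } catch (e) {
        // Assumption: report a failed evaluation as null.
        return null;
    }
}

// Relies on an outer variable, so it throws a ReferenceError in the sandbox.
var broken = fakeEvaluate(function () {
    var posts = [];
    for (var i = 0; i < FEED_ITEMS; i++) { posts.push(i); }
    return { news: posts };
});

// Receives the value as an explicit argument instead.
var fixed = fakeEvaluate(function (count) {
    var posts = [];
    for (var i = 0; i < count; i++) { posts.push(i); }
    return { news: posts };
}, [FEED_ITEMS]);

console.log(broken);                // null
console.log(JSON.stringify(fixed)); // {"news":[0,1,2]}
```

The same idea applies to the real call: anything the evaluated function needs has to be passed in as an argument or already exist in the page, and only plain serializable data comes back.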
DeadAlready answered Sep 29 '22 12:09