
Hide the footprint of CasperJS with Google Analytics

I'm trying to hide the fact that one of my scripts uses CasperJS. Currently I change the resolution, the user agent and the language like this:

casper.userAgent("My UA");
casper.viewport(1600, 900);
casper.page.customHeaders = {'Accept-Language': 'fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3'};

casper.viewport() and casper.page.customHeaders don't seem to have any effect on what Google Analytics reports. On some websites the settings appear to work, but Google Analytics can still tell that I'm a web scraper:

My lang is "c"
Compatibility with JAVA : no
Screen resolution : 1024x768
Flash version : not set

Is there anything I can do to fake these values?

(Piece of) Solution

Thanks to Kasper Pedersen, here is part of the solution:

We can override some variables when the page is initialized:

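casper.on('page.initialized', function (page) {
    page.evaluate(function () {
        (function () {
            // Fake the screen resolution (window.screen, not the viewport)
            window.screen = {
                width: 1600,
                height: 900
            };
            // navigator.javaEnabled is called as a function, so the getter returns one
            window.navigator.__defineGetter__('javaEnabled', function () {
                return function () { return true; };
            });
        })(); // invoke the wrapper; without this call nothing inside it runs
    });
});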

This fakes the screen resolution and Java support. To fake the Flash plugin, we could do something like this:

casper.on('page.initialized', function (page) {
    page.evaluate(function () {
        (function () {
            window.screen = {
                width: 1600,
                height: 900
            };
            var fake_navigator = {};
            for (var i in navigator) {
                fake_navigator[i] = navigator[i];
            }
            fake_navigator.javaEnabled = function () { return true; };
            fake_navigator.language = 'en-US';
            fake_navigator.plugins = {
                length: 1,
                'Shockwave Flash': {
                    description: 'Shockwave Flash 11.9 r900',
                    name: 'Shockwave Flash',
                    version: '11.9.900.117'
                }
            };
            window.navigator = fake_navigator;
        })();
    });
});

With this in place, window.navigator looks fine when I inspect it in PhantomJS, but Google Analytics no longer tracks me as a visitor (I don't appear in the real-time view of Google Analytics).

So I only fake the first two values (screen resolution and Java support). For the language, I change the locale of my server instead (export LC_ALL=en_US.utf8).
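The post does not establish why Google Analytics stopped recording the visit once window.navigator was replaced. One possible workaround, offered only as an untested sketch, is to leave the navigator object in place and override single properties with getters, the way the first snippet does for javaEnabled. The helper name fakeProperty is hypothetical, and whether PhantomJS accepts such a redefinition on its own navigator object is an assumption.

```javascript
// Hypothetical helper (not part of CasperJS or PhantomJS): overrides one
// property with a getter instead of replacing the whole object.
function fakeProperty(target, name, value) {
    Object.defineProperty(target, name, {
        get: function () { return value; },
        configurable: true,
        enumerable: true
    });
    return target;
}

// Intended use inside page.evaluate(). Assumption: the host navigator
// object allows these properties to be redefined.
//   fakeProperty(window.navigator, 'language', 'en-US');
//   fakeProperty(window.navigator, 'javaEnabled', function () { return true; });

// Stand-alone demonstration on a plain object that stands in for navigator
var demoNavigator = { language: 'c', userAgent: 'My UA' };
fakeProperty(demoNavigator, 'language', 'en-US');
```

Only the named property changes; every other property of the object keeps its original value, which is the difference from copying everything into fake_navigator.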

Kevin asked Oct 27 '13 10:10


1 Answer

The language setting is a bit odd, but the screen resolution is probably "wrong" because you're setting the viewport, not the screen resolution. AFAIK Google Analytics uses the window.screen object.

I haven't worked with CasperJS, but in Phantom you could do this:

page.onInitialized = function () {
    page.evaluate(function () {
        window.screen = {
            width: 1600,
            height: 900
        };
    });
};

I think Java is checked using navigator.javaEnabled() and Flash is looked up in navigator.plugins, so something similar could be done for Flash and Java.
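To check which values a page script would see after such overrides, a small probe can read the same properties. This is a minimal sketch and not Google Analytics' actual code; the function name readClientInfo and the shape of its result are made up for illustration.

```javascript
// Hypothetical probe: collects the properties discussed above from a
// window-like object. It only illustrates where the values live.
function readClientInfo(win) {
    var nav = win.navigator || {};
    var scr = win.screen || {};
    var plugins = nav.plugins || {};
    var flash = plugins['Shockwave Flash'];
    var javaOn = typeof nav.javaEnabled === 'function' && nav.javaEnabled();
    return {
        resolution: (scr.width && scr.height) ? scr.width + 'x' + scr.height : 'not set',
        java: javaOn ? 'yes' : 'no',
        flash: flash ? flash.description : 'not set',
        language: nav.language || 'not set'
    };
}

// Stand-alone demonstration with a fake window object
var demoInfo = readClientInfo({
    screen: { width: 1600, height: 900 },
    navigator: {
        language: 'en-US',
        javaEnabled: function () { return true; },
        plugins: { 'Shockwave Flash': { description: 'Shockwave Flash 11.9 r900' } }
    }
});
```

Inside PhantomJS the same function could be defined and called with window in page.evaluate() once the page has loaded, to compare the result with what Google Analytics reports.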

Kasper Pedersen answered Sep 26 '22 08:09