Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Analytics Realtime Sandbox Environment

I am looking for a way to setup a google analytics sandbox environment that will allow me to test out my custom js code near real time.

My app will be using custom variables for advanced segmentation, and I would like to test out multiple scenarios quickly, as opposed to setting up a dummy GA account and wait for a whole day to confirm the test.

Thanks

like image 202
Salman Paracha Avatar asked Jul 24 '10 18:07

Salman Paracha


People also ask

What is Google Analytics sandbox?

Analytics Academy Course Staff A “sandbox” account is just another Google Analytics account that you'll set up for this course in order to practice and test account settings without having to use your existing account. We want you to have an environment to practice in where you're not afraid to make mistakes!

Does Google Analytics support real-time data?

Real-Time is available in all Analytics accounts. No changes to the tracking code are necessary. To see Real-Time: Sign in to Google Analytics..

How do you test Google Analytics on staging?

If you use a staging website You can copy-paste the Google Analytics tracking code provided by your test property on all pages of your staging website and then duplicate all of the existing website tracking currently being deployed via live GA property into the test property.

How long does it take for Google Analytics to show real-time data?

You'll need to wait 24-48 hours after setting up your Google Analytics property to see your data. While you're waiting, you can verify your tracking code is doing its job by checking your real-time reports and debug view or looking at your website source code.


1 Answers

Great question.

For GA, server updates occur every four hours, and after every sixth such update, the entire set is recalculated, which means a 24-hour lag from code change to reliable feedback. This delay also applies to most customizations to the GA Browser (e.g., "custom filters").

So if you are going to use GA as your web metrics system, and you expect to actually rely on those data then a test rig is essential.

For me, it's useful to group test systems for client-side analytics using two rubrics: (i) complete, self-contained (closed-loop) systems; or (ii) simpler automated data pulls from the production system (by "production system" here i mean GA's system, not the Site whose pages the GA code is tracking).

For the latter, just add this line to each page of your Site that contains the GA tracking code, just below '__trackPageview()':

pageTracker._setLocalRemoteServerMode();

That line will cause a copy of each transaction line to be logged to your server's activity log--so in essence, you get the data captured by GA in real-time That's all you need to do to capture the data; to parse it, you can use, for instance, any of the excellent open source web log analyzers like AWStats, or roll your own.

This is simple and reliable--but all it can do is tell you (in real-time) "does the analytics code i just implemented on pages served by my production server actually work?"

Usually, that's not good enough--you would rather know if your code will work before it's on your production server. To do that, you need to simulate the production environment and find a way to access in real-time the data GA collects.

This kind of test rig is a little more involved, but still not difficult.

In sum, it requires these steps:

  1. host/serve the ga.js and the tracking pixel locally;

  2. log the __utm.gif requests (in the GA data flow, each request corresponds to one logged transaction); and

  3. parse the headers into some convenient human-readable form.


If you want more detail than that (ie, a step-by-step implementation), here it is:

I. Hosting/Serving the GA Script (& automating updates

To do that, you can create a small shell script like this one to wget the latest ga.js version into your local directory (replacing the extant version it finds there).

#!/bin/sh
rm /My_Sites/sitename.com/analytics/ga.js
cd /My_Sites/sitename.com/analytics/
wget http://www.google-analytics.com/ga.js
chmod 644 /My_Sites/sitename.com/analytics/ga.js
cd ${OLDPWD}
exit 0;

(Thanks to AskApache.com, which provided the original motivation and config details to do this in a production context.)


II. Create __utm.gif file

This is just a transparent 1x1 pixel gif image, which you will place in Site directory (doesn't matter where, it just needs to match the location recited in your pages)


III. Log the __utm.gif Requests

For a testing protocol in which you are the source of the client-side activity (e.g., you want to verify the cross-browser fidelity of some event-tracking code you've added to a page on your Site, so you automate 5000 clicks on the button you just wired up,serving the page from your dev server set up for this purpose) it's probably simplest to just log the Request Headers, because it's in those headers that the GA script directs the client to gather various data from the DOM, from the location bar (url), and from prior http headers, and append them to a request for a resource on the GA server (__utm.gif, which is just a 1x1 transparent pixel).

For this type of protocol, i use the Firefox addon, LiveHTTPHeaders. You install it like any other Firefox addon, a few mouse clicks is all. Next, open it, and click the "Generator" tab. From this window, you can see the actual requests in real time. At the bottom of the window is a 'save' button to store the log. I find it easier to configure LiveHTTPHeaders to log only the __utm.gif requests; to do that, just click the 'Edit' tab and create a siimple filter to exclude everything except these particular gif images (using the check boxes on the right, and the large text box to the right).

Other kinds of test protocols require you to work from your Server Activity Logs; in that case just add this line to each page of your Site, just below __trackPageview():

pageTracker._setLocalRemoteServerMode();

IV. Parse those logged requests so you can actually read them

So now your log will contain individual transction lines, each one of which is a string appended to an HTTP Request for the GA tracking pixel. This string is just a concatenation of key-value pairs, each key begins with the letters "utm" (probably for "urchin tracker"). Each of these parameters corresponds to a variable that you see in the GA Dashboard (here's a complete list and description of them). This is all you need to know to build a parser. In more detail:

First, here's a sanitized __utm.gif request (the entries in your LiveHTTPHeaders log):

http://www.google-analytics.com/__utm.gif?utmwv=1&utmn=1669045322&utmcs=UTF-8&utmsr=1280x800&utmsc=24-bit&utmul=en-us&utmje=1&utmfl=10.0%20r45&utmcn=1&utmdt=Position%20Listings%20%7C%20Linden%20Lab&utmhn=lindenlab.hrmdirect.com&utmr=http://lindenlab.com/employment&utmp=/employment/openings.php?sort=da&&utmac=UA-XXXXXX-X&utmcc=__utma%3D87045125.1669045322.1274256051.1274256051.1274256051.1%3B%2B__utmb%3D87045125%3B%2B__utmc%3D87045125%3B%2B__utmz%3D87045125.1274256051.1.1.utmccn%3D(referral)%7Cutmcsr%3Dlindenlab.com%7Cutmcct%3D%2Femployment%7Cutmcmd%3Dreferral%3B%2B

This is my parser (in Python):

# regular expression module imported
import re

pattern = r'\&{1,2}'
pat_obj = re.compile(pattern)

# splitting the gif request on the '&' character 
# (which GA originally used to concatenate each piece to build the request)
# (here, i've bound the __utm.gif to the variable by 'gfx')
gfx1 = pat_obj.split(gfx)

# create a look-up table to map a descriptive name to each gif request parameter
# (note, this isn't the entire list, which i've linked to above)
keys = "utmje utmsc utmsr utmac utmcc utmcn utmcr utmcs utmdt utme utmfl utmhn utmn utmp utmr utmul utmwv"
values = "java_enabled screen_color_depth screen_resolution account_string cookies campaign_session_new repeat_campaign_visit language_encoding page_title event_tracking_data flash_version host_name GIF_req_unique_id page_request referral_url browser_language gatc_version"
keys = keys.strip().split()

#create the look-up table
GIF_REQUEST_PARAMS = dict(zip(keys, values))

# parse each request parameter and map the parameter name to a descriptive name:
pattern = r'(utm\w{1,2})=(.*?)$'
pat_obj = re.compile(pattern)

for itm in gfx1 :
    m = pat_obj.search(itm)
    if m :
        fmt = '{0:25} {1:10}'
        print( fmt.format( GIF_REQUEST_PARAMS[m.group(1)], m.group(2) ) )

The result looks like this:

    gatc_version              1         
    GIF_req_unique_id         1669045322
    language_encoding         UTF-8     
    screen_resolution         1280x800  
    screen_color_depth        24-bit    
    browser_language          en-us     
    java_enabled              1         
    flash_version             10.0%20r45
    campaign_session_new      1         
    page_title                Position%20Listings%20%7C%20Linden%20Lab
    host_name                 lindenlab.hrmdirect.com
    referral_url              http://lindenlab.com/employment
    page_request              /employment/openings.php?sort=da
    account_string            UA-XXXXXX-X
    cookies

To avoid making this longer still, i left out the cookies' value. They obviously require a separate parsing step, though it's virtually identical to the step i just showed. Again, each request represents a single transaction, so you can store them as you need to.

like image 149
doug Avatar answered Oct 10 '22 22:10

doug