
Understanding Google Analytics first-party cookies

I am having some trouble understanding the cookies used by Google Analytics. I understand that the tracking code included in the website collects certain information: page hits, the duration of the visit, cookies of the first-party domain, etc. It then attaches all of this information as query string parameters to a GIF pixel request and sends it to the Google Analytics servers. What I do not understand is how Google Analytics can make any sense of these first-party cookies, since the pixel is a request to www.google-analytics.com, a third-party server.

The tracking code sits inside the publisher's page and executes as first party, which gives Google Analytics access to the first-party cookies. But when these cookies eventually reach the GA servers, the servers cannot really read them as cookies, can they? One explanation could be that once the first-party cookie values are shared with GA, it no longer matters where they are sent (or that the pixel itself sets a third-party cookie in the browser, because it is a request to www.google-analytics.com): the GA servers get a unique id for that user (based on the first-party cookie id) and can thus maintain a record for that user on subsequent requests. Is this understanding correct?

Could anyone please help clarify this? Thank you.

QPTR asked Nov 10 '16 04:11


1 Answer

The current version of Google Analytics uses a single cookie for tracking purposes (and might use others for throttling or experiments).

These are first-party cookies, set by an injected script, because third-party cookies have a higher chance of being rejected. However, being first-party cookies on your own domain, they do not reach the Google Analytics server at all (at least not as part of the HTTP headers).
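To illustrate why the cookie never shows up in the HTTP headers of the tracking request, here is a minimal sketch of the domain check a browser applies before attaching a cookie. It is a simplified version of the domain-matching rule from RFC 6265 (it skips the IP address case), and `domainMatches` is a hypothetical helper name, not a browser API.

```typescript
// Simplified cookie domain matching, loosely following RFC 6265.
// Assumption: hosts are plain domain names (the IP address case is ignored).
function domainMatches(requestHost: string, cookieDomain: string): boolean {
  const host = requestHost.toLowerCase();
  // A leading dot on the cookie domain is ignored.
  const domain = cookieDomain.toLowerCase().replace(/^\./, "");
  if (host === domain) return true;
  // Otherwise the host must end with "." + domain (a true subdomain).
  const suffix = "." + domain;
  return host.length > suffix.length && host.slice(-suffix.length) === suffix;
}

// A cookie scoped to the publisher's domain goes to the publisher's hosts...
const sentToPublisher = domainMatches("www.example.com", "example.com");
// ...but is not attached to a request to the analytics host.
const sentToAnalytics = domainMatches("www.google-analytics.com", "example.com");
```

Under these assumptions `sentToPublisher` is true and `sentToAnalytics` is false, which is why the value has to travel in the request URL instead.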

The cookie is used on the client side only, to maintain a client id that makes it possible to stitch pageviews into sessions and users. The information from the cookie is read via JavaScript and appended to the request to the tracking server.
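The read-and-append step can be sketched roughly as follows. This is not the actual analytics.js code. It assumes the Universal Analytics `_ga` cookie layout (`GA1.<n>.<random>.<timestamp>`, where the last two fields form the client id) and the Measurement Protocol parameters `v`, `tid`, `cid`, `t` and `dp`. The function names and the `UA-XXXXX-Y` tracking id are placeholders.

```typescript
// Parse a "name=value; name2=value2" string, the format of document.cookie.
function parseCookies(cookieString: string): Record<string, string> {
  const cookies: Record<string, string> = {};
  for (const part of cookieString.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue; // skip empty or malformed entries
    cookies[trimmed.slice(0, eq)] = decodeURIComponent(trimmed.slice(eq + 1));
  }
  return cookies;
}

// Assumed layout: "GA1.2.<random>.<timestamp>"; the client id is the last two fields.
function clientIdFromGaCookie(value: string): string | null {
  const fields = value.split(".");
  if (fields.length < 4) return null;
  return fields.slice(-2).join(".");
}

// Build the tracking request URL with the client id in the query string.
function buildPageviewUrl(
  cookieString: string,
  trackingId: string,
  pagePath: string
): string | null {
  const ga = parseCookies(cookieString)["_ga"];
  if (ga === undefined) return null;
  const cid = clientIdFromGaCookie(ga);
  if (cid === null) return null;
  const params: Array<[string, string]> = [
    ["v", "1"],          // protocol version
    ["tid", trackingId], // property id (placeholder)
    ["cid", cid],        // client id read from the first-party cookie
    ["t", "pageview"],   // hit type
    ["dp", pagePath],    // document path
  ];
  const query = params
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return `https://www.google-analytics.com/collect?${query}`;
}
```

In a browser you would pass `document.cookie` as the first argument and request the resulting URL, for example by assigning it to the `src` of an image. The server then sees the client id as an ordinary query parameter, not as a cookie.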

Historically, though, the cookie information was indeed intended to be sent to the server: Urchin, the product that was later acquired by Google and turned into Google Analytics, was originally a logfile analyzer that augmented server logfiles with a cookie:

The UTM, or Urchin Traffic Monitor, was an early method for augmenting Apache (or IIS, etc.) log files with cookies, such that unique visitors could be established. This method entailed a line of javascript in the of each page on the site, and a small modification to the webserver’s logging behavior. Most of our competitors at the time used either logs only (old school) or javascript/cookies only (WebSideStory, etc.), and both necessarily missed out on a lot of available information. Urchin was the first to use both data sources in one unified collection method, neatly contained in augmented access-log files. Nowadays pretty much everything you’d want can be had via the cookie method (á la GA), but analyzing logs still has its advantages.

So back then the cookies were actually meant to be used for server-side analysis. Today the server-side aspect is just a side effect; the actual use is in client-side code.

Eike Pierstorff answered Oct 17 '22 22:10