
How to measure the statistical significance of a click through rate?

I'm building an open source project that will measure whether the differences in the click through rates of various Facebook advertisements are statistically significant. Taking inspiration from http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=167743, I wrote the following Ruby code (assume any methods not defined here do exactly what their names say).

The click through rate is defined as the number of clicks on an ad divided by the number of impressions of that ad, expressed as a percentage.



  # ** exponentiation
  # * multiplication
  # / division
  def standard_deviation
    (experiment_ctr / (control_ctr**3) *
      (no_of_clicks_for_control + no_of_clicks_for_experiment -
       product_of_ctrs * total_no_of_impressions) / product_of_impressions)**0.5
  end

  def z_score
    (ratio_of_experiment_ctr_to_control - 1) / standard_deviation
  end
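
For anyone who wants to run this, a self-contained sketch might look like the following. The `AdStats` struct, the `CtrSignificance` class and the way the helpers are computed are assumptions inferred from the method names above, not code taken from the project. The expression is rearranged into a numerator and a denominator, which is algebraically the same formula.

```ruby
# Hypothetical container for one ad's raw counts (an assumption, not project code).
AdStats = Struct.new(:clicks, :impressions) do
  # Click through rate as a fraction: clicks divided by impressions.
  def ctr
    clicks.to_f / impressions
  end
end

# Hypothetical wrapper that fills in the helpers the snippet above relies on.
class CtrSignificance
  def initialize(control, experiment)
    @control = control
    @experiment = experiment
  end

  # Same formula as standard_deviation above, with the helpers spelled out.
  def standard_deviation
    control_ctr = @control.ctr
    experiment_ctr = @experiment.ctr
    product_of_ctrs = control_ctr * experiment_ctr
    total_clicks = @control.clicks + @experiment.clicks
    total_impressions = @control.impressions + @experiment.impressions
    product_of_impressions = @control.impressions.to_f * @experiment.impressions

    numerator = experiment_ctr * (total_clicks - product_of_ctrs * total_impressions)
    denominator = control_ctr**3 * product_of_impressions
    Math.sqrt(numerator / denominator)
  end

  # Ratio of experiment CTR to control CTR, minus 1, in units of the standard deviation.
  def z_score
    (@experiment.ctr / @control.ctr - 1) / standard_deviation
  end
end

control = AdStats.new(100, 10_000)
experiment = AdStats.new(130, 10_000)
puts CtrSignificance.new(control, experiment).z_score
```

The counts in the last three lines are made-up numbers for illustration only.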


I copied the standard deviation formula from the Google website, but it looks fishy to me. Does anyone know whether it is correct?

Much appreciated.

Jack Kinsella asked Nov 06 '22 01:11


1 Answer

It does not look familiar because it is not one of the ordinary significance tests most people are accustomed to seeing. Most significance tests are formulated as follows (grossly overgeneralized, please no flames):

  1. compute a sample statistic, X
  2. determine the expected value of that statistic, E
  3. determine the standard deviation of that statistic, S
  4. compute the test statistic T = (X - E) / S
  5. determine if T is significant based on the hypothesized distribution of T.

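For the common significance tests of a mean, X is the sample mean, E is the mean assumed under the null hypothesis, and S is derived from the sample standard deviation we are most familiar with (the sample standard deviation divided by the square root of the sample size).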
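
As a concrete instance of the five steps, a one-sample test of a mean might be sketched in Ruby as follows. It takes X as the sample mean, E as the mean assumed under the null hypothesis, and S as the standard error of the mean. The method names are made up for illustration, and the p-value assumes T is approximately standard normal, which is only reasonable for large samples.

```ruby
# Steps 1 to 4: build T = (X - E) / S for a one-sample test of a mean.
def mean_test_statistic(sample, hypothesized_mean)
  n = sample.size
  x = sample.sum.to_f / n                              # step 1: sample statistic X
  e = hypothesized_mean                                # step 2: expected value E
  variance = sample.sum { |v| (v - x)**2 } / (n - 1)   # unbiased sample variance
  s = Math.sqrt(variance / n)                          # step 3: standard error S
  (x - e) / s                                          # step 4: test statistic T
end

# Step 5: two-sided p-value, assuming T follows a standard normal distribution.
def two_sided_p_value(t)
  Math.erfc(t.abs / Math.sqrt(2.0))
end

t = mean_test_statistic([1, 2, 3, 4, 5], 2)
puts "T = #{t}, p = #{two_sided_p_value(t)}"
```

A small p-value suggests the observed statistic is unlikely under the hypothesized distribution. For small samples a t distribution would normally replace the normal approximation used here.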

This significance test uses a ratio as its sample statistic: the (E-C)/C formula provided by Google, where E and C refer to the experiment and control figures, not the expected value E from the list above. According to Google, this statistic has an expected value of (1 / (1-p)) - 2 and a standard deviation of (p / ( (C+E) * (1-p)^3 ))^0.5. Those are the numbers to plug into the T formula above, which gives the z-score in Google's explanation.
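
Plugging those expressions into T could be sketched as below. This is a literal transcription of the formulas as quoted in this answer, not a checked implementation of Google's method: the method names are made up, and `prob`, `c` and `e` stand for the p, C and E of the quoted formulas, whose exact definitions come from Google's page and are not restated here.

```ruby
# Expected value exactly as quoted above: (1 / (1 - p)) - 2.
def quoted_expected_value(prob)
  (1.0 / (1.0 - prob)) - 2.0
end

# Standard deviation exactly as quoted above: (p / ((C + E) * (1 - p)^3))^0.5.
def quoted_standard_deviation(prob, c, e)
  Math.sqrt(prob / ((c + e) * (1.0 - prob)**3))
end

# Generic test statistic from step 4: T = (X - E) / S.
def test_statistic(x, expected, sd)
  (x - expected) / sd
end

# Made-up inputs, purely to show how the pieces fit together.
prob = 0.5
c = 50
e = 60
x = (e - c) / c.to_f # the (E-C)/C sample statistic
puts test_statistic(x, quoted_expected_value(prob), quoted_standard_deviation(prob, c, e))
```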

So, even though the formula looks odd, it is based on sound fundamentals. You should be able to use it with confidence.

Brent Worden answered Nov 09 '22 10:11