Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding T-SQL stdev, stdevp, var, and varp

I'm having a difficult time understand what these statistics functions do and how they work. I'm having an even more difficult time understanding how stdev works vs stdevp and the var equivelant. Can someone please break these down into dumb for me?

like image 921
James Alexander Avatar asked Sep 21 '09 16:09

James Alexander


1 Answers

In statistics Standard Deviation and Variance are measures of how much a metric in a population deviate from the mean (usually the average.) The Standard Deviation is defined as the square root of the Variance and the Variance is defined as the average of the squared difference from the mean, i.e.:

For a population of size n: x1, x2, ..., xn with mean: xmean

Stdevp = sqrt( ((x1-xmean)^2 + (x2-xmean)^2 + ... + (xn-xmean)^2)/n )

When values for the whole population are not available (most of the time) it is customary to apply Bessel's correction to get a better estimate of the actual standard deviation for the whole population. Bessel's correction is merely dividing by n-1 instead of by n when computing the variance, i.e:

Stdev = sqrt( ((x1-xmean)^2 + (x2-xmean)^2 + ... + (xn-xmean)^2)/(n-1) )

Note that for large enough data-sets it won't really matter which function is used.

You can verify my answer by running the following T-SQL script:

-- temporary data set with values 2, 3, 4
declare @t table([val] int);

insert into @t values
    (2),(3),(4);

select avg(val) as [avg], -- equals to 3.0
   -- Estimation of the population standard devisation using a sample and Bessel's Correction:
   -- ((x1 - xmean)^2 + (x2 - xmean)^2 + ... + (xn-xmean)^2)/(n-1)  
   stdev(val) as [stdev],
   sqrt( (square(2-3.0) + square(3-3) + square(4-3))/2) as [stdev calculated], -- calculated with value 2, 3, 4

   -- Population standard deviation: 
   -- ((x1 - xmean)^2 + (x2 - xmean)^2 + ... + (xn-xmean)^2)/n
   stdevp(val) as [stdevp],
   sqrt( (square(2-3.0) + square(3-3) + square(4-3))/3) as [stdevp calculated] -- calculated with value 2, 3, 4
from @t;

Further reading wikipedia articles for: standard deviation and Bessel's Correction.

like image 79
Eli Algranti Avatar answered Jan 03 '23 18:01

Eli Algranti