Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there any difference between numpy.std and excel STDEV function?

I have a list:

s = [0.995537725, 0.994532199, 0.996027983, 0.999891383, 1.004754272, 1.003870012, 0.999888944, 0.994438078, 0.992548715, 0.998344545, 1.004504764, 1.00883411]

where I calculated its standard deviation in Excel, I got the answer: 0.005106477, the function I used was: =STDEV(C5:N5)

Then I do the same calculation using numpy.std as:

import numpy as np

print np.std(s)

However, I got the answer: 0.0048890791894

I even wrote up my own std function:

def std(input_list):
        count = len(input_list)

        mean = float(sum(input_list)) / float(count)

        overall = 0.0
        for i in input_list:
            overall = overall + (i - mean) * (i - mean)

        return math.sqrt(overall / count)

and my own function gives the same result as numpy.

So I am wondering is there such a difference? Or it just I made some mistake?

like image 793
ChangeMyName Avatar asked Dec 07 '15 12:12

ChangeMyName


Video Answer


1 Answers

There's a difference: Excel's STDEV calculates the sample standard deviation, while NumPy's std calculates the population standard deviation by default (it is behaving like Excel's STDEVP).

To make NumPy's std function behave like Excel's STDEV, pass in the value ddof=1:

>>> np.std(s, ddof=1)
0.0051064766704396617

This calculates the standard deviation of s using the sample variance (i.e. dividing by n-1 rather than n.)

like image 167
Alex Riley Avatar answered Oct 09 '22 14:10

Alex Riley