Why do the diagonal elements of numpy cov differ from the values returned by var?

Tags:

python

numpy

In [127]: x = np.arange(2)

In [128]: np.cov(x,x)
Out[128]:
array([[ 0.5,  0.5],
       [ 0.5,  0.5]])

In [129]: x.var()
Out[129]: 0.25

Why is this the behavior? I believe that covariance matrix diagonal elements should be the variance of the series.


asked Jan 09 '14 20:01

zsljulius

1 Answer

In numpy, cov defaults to a "delta degrees of freedom" (ddof) of 1, while var defaults to a ddof of 0. From the notes to numpy.var:

Notes
-----
The variance is the average of the squared deviations from the mean,
i.e.,  ``var = mean(abs(x - x.mean())**2)``.

The mean is normally calculated as ``x.sum() / N``, where ``N = len(x)``.
If, however, `ddof` is specified, the divisor ``N - ddof`` is used
instead.  In standard statistical practice, ``ddof=1`` provides an
unbiased estimator of the variance of a hypothetical infinite population.
``ddof=0`` provides a maximum likelihood estimate of the variance for
normally distributed variables.
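
To make the divisor in those notes concrete, here is a minimal sketch that computes the variance by hand with the divisor ``N - ddof`` and compares it to x.var(). The helper name variance_with_ddof is made up for illustration and is not part of numpy.

```python
import numpy as np

def variance_with_ddof(values, ddof=0):
    """Hypothetical helper: sum of squared deviations divided by N - ddof."""
    values = np.asarray(values, dtype=float)
    n = values.size
    squared_dev = np.abs(values - values.mean()) ** 2
    return squared_dev.sum() / (n - ddof)

x = np.arange(2)

# Divisor N (ddof=0): what x.var() uses by default.
population = variance_with_ddof(x, ddof=0)

# Divisor N - 1 (ddof=1): what np.cov uses by default.
sample = variance_with_ddof(x, ddof=1)

print(population, x.var())
print(sample, x.var(ddof=1))
```

With only two points the two divisors are 2 and 1, which is why the values in the question differ by exactly a factor of two.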

So you can get them to agree by taking:

In [69]: np.cov(x, x)  # defaulting to ddof=1
Out[69]: 
array([[ 0.5,  0.5],
       [ 0.5,  0.5]])

In [70]: x.var(ddof=1)
Out[70]: 0.5

In [71]: np.cov(x, x, ddof=0)
Out[71]: 
array([[ 0.25,  0.25],
       [ 0.25,  0.25]])

In [72]: x.var()  # defaulting to ddof=0
Out[72]: 0.25
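
The same check can be sketched for a larger data set. The array below is made-up random data, assumed only for illustration, with each row treated as one variable (the default layout for np.cov). The diagonal of the covariance matrix should match the per-row variance as long as both calls use the same ddof.

```python
import numpy as np

# Made-up data: 3 variables (rows), 50 observations each.
rng = np.random.default_rng(0)
data = rng.normal(size=(3, 50))

# np.cov defaults to ddof=1, so compare against var(ddof=1).
cov_default = np.cov(data)
var_sample = data.var(axis=1, ddof=1)
diag_matches_sample = np.allclose(np.diag(cov_default), var_sample)

# Passing ddof=0 to np.cov matches the default of var.
cov_population = np.cov(data, ddof=0)
var_population = data.var(axis=1)
diag_matches_population = np.allclose(np.diag(cov_population), var_population)

print(diag_matches_sample, diag_matches_population)
```

The two covariance matrices differ only by the factor N / (N - 1), so the gap shrinks as the number of observations grows.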

answered Oct 26 '22 03:10

mmdanziger
