I assume numpy.cov(X) computes the sample covariance matrix as:
1/(N-1) * Sum (x_i - m)(x_i - m)^T (where m is the mean)
I.e., the sum of outer products. But nowhere in the documentation does it actually say this; it just says "Estimate a covariance matrix".
Can anyone confirm whether this is what it does internally? (I know I can change the constant out the front with the bias parameter.)
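As you can see from the source, in the simplest case with no masks, and N variables with M samples each (so x has shape (N, M)), it returns the (N, N) covariance matrix calculated as:
(x-m) * (x-m).T.conj() / (M - 1)
Where m is the per-variable mean, the * represents the matrix product[1], and the divisor is the number of samples minus one (the N in your formula).
Implemented roughly as:
X = X - X.mean(axis=1, keepdims=True)  # subtract each variable's (row's) mean
M = X.shape[1]  # number of samples per variable
fact = float(M - 1)
return dot(X, X.T.conj()) / fact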
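If you want to convince yourself without reading the source, here is a minimal sketch that recomputes the matrix by hand and compares it with numpy.cov. The helper name manual_cov and the random test data are my own, purely for illustration, and it assumes the default layout (one variable per row, one sample per column) and the default normalisation.

```python
import numpy as np


def manual_cov(x):
    """Sample covariance of x with shape (N variables, M samples).

    Hypothetical helper for illustration; not part of NumPy.
    """
    x = np.asarray(x, dtype=float)
    # Subtract each variable's (row's) mean from that row.
    centered = x - x.mean(axis=1, keepdims=True)
    m_samples = x.shape[1]
    # Matrix product of the centered data with its conjugate transpose,
    # divided by (number of samples - 1).
    return centered @ centered.T.conj() / (m_samples - 1)


rng = np.random.default_rng(0)
data = rng.normal(size=(3, 10))  # 3 variables, 10 samples (made-up data)

matches = np.allclose(manual_cov(data), np.cov(data))
print(matches)
```

On this input the comparison should print True, and the result has shape (3, 3), i.e. one row and column per variable.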
If you want to review the source, look here instead of the link from Mr E unless you're interested in masked arrays. As you mentioned, the documentation isn't great.
[1] which in this case is effectively (but not exactly) the outer product because (x-m) has M column vectors of length N (one per sample) and thus (x-m).T has as many row vectors. The end result is the sum of all the outer products, one per sample. The same * will give the inner (scalar) product if the order is reversed, i.e. a row vector times a column vector. But technically these are both just standard matrix multiplications, and the true outer product is only the product of a column vector with a row vector.
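To make the footnote concrete, the sketch below sums one outer product per sample and checks that it equals the single matrix product. The data is random and made up for the example, and the variable names are mine; it again assumes variables in rows and samples in columns.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 8))  # 3 variables, 8 samples (made-up data)
centered = x - x.mean(axis=1, keepdims=True)

# One outer product per sample: each column of `centered` is a length-3 vector.
outer_sum = sum(np.outer(col, col) for col in centered.T)

# The same quantity as a single matrix product.
as_matmul = centered @ centered.T
same = np.allclose(outer_sum, as_matmul)

# Dividing by (samples - 1) gives the default np.cov result.
cov_matches = np.allclose(outer_sum / (x.shape[1] - 1), np.cov(x))

# Reversing the order gives inner products between samples instead,
# so the result is (8, 8) rather than (3, 3).
gram_shape = (centered.T @ centered).shape

print(same, cov_matches, gram_shape)
```

Note that for the whole data matrix the reversed product is an (M, M) matrix of inner products; you only get a single scalar when each factor is one vector.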