The standard cov function calculates the sample covariance matrix, but I want the population covariance matrix.
I tried the following:
cov.pop <- function(x,y=NULL) {
cov(x,y)*(length(x)-1)/length(x)
}
> sapply(list(Apple,HP,Microsoft),cov.pop,y=Apple) #correct
[1] 0.7861672 0.1363396 0.2223303
> sapply(list(Apple,HP,Microsoft),cov.pop,y=HP) #correct
[1] 0.13633964 0.09560376 0.05226032
> sapply(list(Apple,HP,Microsoft),cov.pop,y=Microsoft) #correct
[1] 0.22233028 0.05226032 0.13519964
> cov.pop(cbind(Apple,HP,Microsoft)) #not correct
              Apple         HP  Microsoft
Apple     0.8444018 0.14643887 0.23879919
HP        0.1464389 0.10268552 0.05613145
Microsoft 0.2387992 0.05613145 0.14521443
My question
Is there a simple way to modify the cov.pop function to get the correct population covariance matrix?
The population covariance between two linear combinations of variables is obtained by summing over all pairs of variables: for each pair (j, k), multiply the coefficient of variable j in the first combination by the coefficient of variable k in the second, and by the covariance between variables j and k. The population covariance can then be estimated by using the sample covariance.
The only difference between the formulas for population covariance and sample covariance is the denominator: the population covariance divides by N, the number of observations in the entire dataset, whereas the sample covariance divides by N-1, so that the denominator of the population covariance is 1 larger than that of the sample ...
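As a rough illustration of that relationship, the sketch below computes the population covariance directly from its definition (dividing by N) and compares it with the result of cov() rescaled by (N-1)/N. The vectors x and y are made-up values, not data from the question.

```r
# Hypothetical data, for illustration only
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)
n <- length(x)

# Population covariance from the definition: divide by N
pop_direct <- sum((x - mean(x)) * (y - mean(y))) / n

# Sample covariance (cov() divides by N - 1), rescaled to the population version
pop_rescaled <- cov(x, y) * (n - 1) / n

all.equal(pop_direct, pop_rescaled)  # TRUE
```

Both routes give the same number here, which is why multiplying cov() by (N-1)/N is a reasonable approach as long as N really is the number of observations.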
To create a covariance matrix from a data frame in R, use the cov() function. It takes the data frame as an argument and returns the sample variance-covariance matrix.
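I guess the results are different because length() does not return the same thing in the two cases: for the matrix (i.e. cbind(Apple, HP, Microsoft)) it is the total number of elements (rows times columns), whereas for each list element it is the number of observations. Using NROW() instead gives the number of observations in both cases: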
cov.pop <- function(x,y=NULL) {
cov(x,y)*(NROW(x)-1)/NROW(x)
}
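The difference between length() and NROW() can be seen on a small example. The sketch below uses a made-up 10-by-3 matrix (the name m is hypothetical); for a plain vector the two functions agree, which would explain why the sapply calls in the question were unaffected.

```r
# Hypothetical 10 x 3 matrix, for illustration only
m <- matrix(seq_len(30), nrow = 10, ncol = 3)

length(m)        # 30: total number of elements (rows * columns)
NROW(m)          # 10: number of observations (rows)

length(m[, 1])   # 10: for a vector, length() and NROW() agree
NROW(m[, 1])     # 10

# Resulting correction factors
factor_length <- (length(m) - 1) / length(m)  # 29/30, wrong for a matrix
factor_nrow   <- (NROW(m) - 1) / NROW(m)      # 9/10, the intended (n - 1)/n
```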
Using an example dataset
set.seed(24)
Apple <- rnorm(140)
HP <- rnorm(140)
Microsoft <- rnorm(140)
cov.pop(cbind(Apple,HP,Microsoft))
#                Apple          HP  Microsoft
#Apple     0.946489639 0.006511604 0.02518080
#HP        0.006511604 1.015532869 0.04940075
#Microsoft 0.025180805 0.049400745 1.08388185
sapply(list(Apple,HP,Microsoft),cov.pop,y=Apple)
#[1] 0.946489639 0.006511604 0.025180805
sapply(list(Apple,HP,Microsoft),cov.pop,y=HP)
#[1] 0.006511604 1.015532869 0.049400745
sapply(list(Apple,HP,Microsoft),cov.pop,y=Microsoft)
#[1] 0.02518080 0.04940075 1.08388185
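One way to check the matrix result independently is to compute the population covariance matrix from centered data, without calling cov() at all. This is only a sketch: pop_cov_direct is a hypothetical helper name, and the data are regenerated with the same seed and order as in the example dataset.

```r
# Same corrected function as in the answer
cov.pop <- function(x, y = NULL) {
  cov(x, y) * (NROW(x) - 1) / NROW(x)
}

# Hypothetical helper: population covariance matrix from the definition
pop_cov_direct <- function(m) {
  centered <- scale(m, center = TRUE, scale = FALSE)  # subtract column means
  crossprod(centered) / nrow(m)                       # t(X) %*% X / N
}

set.seed(24)
m <- cbind(Apple = rnorm(140), HP = rnorm(140), Microsoft = rnorm(140))

# Compare values only; scale() attaches extra attributes that are irrelevant here
all.equal(cov.pop(m), pop_cov_direct(m), check.attributes = FALSE)
```

If the two results agree, the NROW-based correction factor is doing what the definition requires.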