
Constructing scores from princomp loadings in R

Tags: r, pca, princomp

I would like to be able to construct the scores of a principal component analysis using its loadings, but I cannot figure out what the princomp function is actually doing when it computes the scores of a dataset. A toy example:

cc <- matrix(1:24, ncol = 4)
PCAcc <- princomp(cc, scores = TRUE, cor = TRUE)
PCAcc$loadings

Loadings:
     Comp.1 Comp.2 Comp.3 Comp.4
[1,]  0.500  0.866              
[2,]  0.500 -0.289  0.816       
[3,]  0.500 -0.289 -0.408 -0.707
[4,]  0.500 -0.289 -0.408  0.707

PCAcc$scores

       Comp.1        Comp.2        Comp.3 Comp.4
[1,] -2.92770 -6.661338e-16 -3.330669e-16      0
[2,] -1.75662 -4.440892e-16 -2.220446e-16      0
[3,] -0.58554 -1.110223e-16 -6.938894e-17      0
[4,]  0.58554  1.110223e-16  6.938894e-17      0
[5,]  1.75662  4.440892e-16  2.220446e-16      0
[6,]  2.92770  6.661338e-16  3.330669e-16      0

My understanding is that the scores are the rescaled original data multiplied by the loadings. Trying this by hand:

# center each column (no scaling applied)
rescaled <- t(t(cc) - apply(cc, 2, mean))
rescaled %*% PCAcc$loadings

     Comp.1        Comp.2        Comp.3 Comp.4
[1,]     -5 -1.332268e-15 -4.440892e-16      0
[2,]     -3 -6.661338e-16 -3.330669e-16      0
[3,]     -1 -2.220446e-16 -1.110223e-16      0
[4,]      1  2.220446e-16  1.110223e-16      0
[5,]      3  6.661338e-16  3.330669e-16      0
[6,]      5  1.332268e-15  4.440892e-16      0

The columns are off by a factor of 1.707825, 2, and 1.333333, respectively. Why is this? Since the toy data matrix has the same variance in each column, normalization shouldn't be necessary here. Any help is greatly appreciated.

Thanks!

Escotch asked Jun 01 '13 06:06


People also ask

How do you interpret PCA loadings in R?

Positive loadings indicate a variable and a principal component are positively correlated: an increase in one results in an increase in the other. Negative loadings indicate a negative correlation. Large (either positive or negative) loadings indicate that a variable has a strong effect on that principal component.

How do you interpret PCA negative loadings?

Negative correlations among variables and negative loadings do not cause any specific concerns in PCA. In the interpretation of PCA, a negative loading simply means that a certain characteristic is lacking in a latent variable associated with the given principal component.

What is the difference between prcomp() and princomp() in R?

The function princomp() uses the spectral (eigen) decomposition approach. The functions prcomp() and PCA() (from the FactoMineR package) use the singular value decomposition (SVD). According to the R help, SVD has slightly better numerical accuracy, so prcomp() is generally preferred over princomp().
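
One practical consequence of the two approaches can be sketched as follows. This assumes R's built-in USArrests dataset as a stand-in (it does not appear on the original page), and the names `pc_eigen`, `pc_svd`, and `ratio` are made up for illustration. With cor = TRUE, princomp() standardizes using divisor n, while prcomp() with scale. = TRUE uses the usual n - 1 standard deviation, so the scores should differ by a factor of sqrt(n / (n - 1)), up to the sign of each component.

```r
x <- as.matrix(USArrests)   # built-in dataset, used here only as an example
n <- nrow(x)

pc_eigen <- princomp(x, cor = TRUE)    # eigendecomposition, divisor n
pc_svd   <- prcomp(x, scale. = TRUE)   # SVD, divisor n - 1

# The sign of each component is arbitrary, so compare absolute values
ratio <- sqrt(n / (n - 1))
all.equal(abs(unname(pc_eigen$scores)), abs(unname(pc_svd$x)) * ratio)
```

If the comparison holds, the two functions agree on the components themselves and differ only in how the data were standardized.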

How do you explain PCA loadings?

PCA loadings are the coefficients of the linear combination of the original variables from which the principal components (PCs) are constructed.


1 Answer

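You need to center and scale the data with the values princomp stored before multiplying by the loadings:

scale(cc, PCAcc$center, PCAcc$scale) %*% PCAcc$loadings

or, more simply:

predict(PCAcc, newdata = cc)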
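
As a rough check of why this works (a sketch built on the toy data from the question, not part of the original answer): with cor = TRUE, princomp standardizes each column using the center and scale it stores, and that scale is a standard deviation computed with divisor n rather than n - 1. For the columns of cc this comes out to about 1.707825, which is the factor observed for the first component in the question. The names `sd_n` and `manual` are introduced here purely for illustration.

```r
cc <- matrix(1:24, ncol = 4)
PCAcc <- princomp(cc, scores = TRUE, cor = TRUE)

# Standard deviation of the first column with divisor n (not n - 1)
sd_n <- sqrt(mean((cc[, 1] - mean(cc[, 1]))^2))
sd_n          # compare with the values in PCAcc$scale
PCAcc$scale

# Center and scale with the stored values, then project onto the loadings
manual <- scale(cc, PCAcc$center, PCAcc$scale) %*% PCAcc$loadings

# Both comparisons should report TRUE
all.equal(unname(manual), unname(PCAcc$scores))
all.equal(unname(predict(PCAcc, newdata = cc)), unname(PCAcc$scores))
```

So even though every column of the toy matrix has the same variance, that variance is not 1, and the correlation-based PCA still divides each column by its standard deviation before projecting.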
Ian Fellows answered Sep 23 '22 22:09