Principal component analysis in R with prcomp and by myself: different results

Tags: r, pca

Where am I going wrong? I am trying to perform PCA both with prcomp and by hand, and I get different results. Can you please help me?

DOING IT BY MYSELF:

>database <- read.csv("E:/R/database.csv", sep=";", dec=",") # 105 rows x 8 columns; each column is a variable
>matrix.cor<-cor(database)
>standardize<-function(x) {(x-mean(x))/sd(x)}
>values.standard<-apply(database, MARGIN=2, FUN=standardize)
>my.eigen<-eigen(matrix.cor)
>loadings<-my.eigen$vectors
>scores<-values.standard %*% loadings
>head (scores, n=10) # I'm posting only the first row of scores for the first 6 PCs

      [,1]       [,2]       [,3]        [,4]       [,5]        [,6]
 2.3342586  2.3426398 -0.9169527  0.80711713  1.1409138 -0.25832090

>sd <-sqrt (my.eigen$values)
>sd

[1] 1.5586078 1.1577093 1.1168477 0.9562853 0.8793033 0.8094500 0.6574788
0.4560247

DOING IT WITH PRCOMP:

>database.pca<-prcomp(database, retx=TRUE, center=TRUE, scale.=TRUE)
>sd1<-database.pca$sdev 
>loadings1<-database.pca$rotation
>rownames(loadings1)<-colnames(database)
>scores1<-database.pca$x
>head (scores1, n=10)
PC1        PC2        PC3         PC4        PC5         PC6       
-2.3342586  2.3426398  0.9169527  0.80711713  1.1409138  0.25832090

range (scores-scores1) is not zero! Please help me!!! Gloria

Gloria Dalla Costa asked Jan 29 '13 22:01




1 Answer

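It looks like your principal component scores have come out essentially the same, just with opposite signs in some columns (PC1, PC3 and PC6 in the rows you posted). As I learned here, the sign of a principal component is arbitrary: if v is an eigenvector of the correlation matrix, then so is -v, so eigen and prcomp are each free to return either one.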

If you test your manually calculated scores with something like range(abs(scores) - abs(scores1)) instead, you should get something pretty close to 0 (maybe not exactly 0, due to possible floating-point precision effects).
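If you want the two sets of scores to match sign for sign rather than only in absolute value, one option is to flip each manually computed component so that it points the same way as the corresponding prcomp loading vector. Below is a minimal sketch of that idea. Since I don't have your database.csv, it uses the built-in mtcars data as a stand-in (an assumption), and the names flip, scores.aligned, max.diff and sdev.diff are only illustrative.

```r
# Stand-in data: built-in mtcars columns replace database.csv (an assumption).
database <- mtcars[, 1:7]

# Manual PCA: eigen-decomposition of the correlation matrix.
values.standard <- scale(database)          # center and scale each column
my.eigen <- eigen(cor(database))
scores <- values.standard %*% my.eigen$vectors

# PCA via prcomp on the same standardized data.
database.pca <- prcomp(database, center = TRUE, scale. = TRUE)
scores1 <- database.pca$x

# For each component, the dot product of the two loading vectors is close
# to +1 (same direction) or -1 (opposite direction); its sign says whether to flip.
flip <- sign(colSums(my.eigen$vectors * database.pca$rotation))
scores.aligned <- sweep(scores, 2, flip, `*`)

# Both differences should be at floating-point noise level.
max.diff <- max(abs(scores.aligned - scores1))
sdev.diff <- max(abs(sqrt(my.eigen$values) - database.pca$sdev))
print(c(max.diff = max.diff, sdev.diff = sdev.diff))
```

This relies on the eigenvalues being distinct, so that each eigenvector is determined up to sign only. Note that the standard deviations never need aligning, because they do not depend on the sign of the eigenvectors.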

Marius answered Sep 18 '22 12:09
