How to reverse PCA in prcomp to get original data

Tags:

r

I want to reverse the PCA calculated from prcomp to get back to my original data.

I thought something like the following would work:

pca$x %*% t(pca$rotation)

but it doesn't.

The following link shows how to get back the original data from the PCs, but it only covers PCA computed with eigen on the covariance matrix: http://www.di.fc.ul.pt/~jpn/r/pca/pca.html

prcomp doesn't calculate PCs that way.

"The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix." -prcomp

Jase Villam asked Apr 21 '15 21:04



1 Answer

prcomp centers the variables by default, so you need to add the subtracted means back:

t(t(pca$x %*% t(pca$rotation)) + pca$center)

If you called prcomp with scale. = TRUE (so that pca$scale holds the column standard deviations instead of FALSE), you will also need to re-scale:

t(t(pca$x %*% t(pca$rotation)) * pca$scale + pca$center)
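
To check that these formulas really recover the input, here is a minimal sketch. It assumes the built-in iris data as a stand-in for your own matrix, and reconstruct is a hypothetical helper name, not part of prcomp. It uses sweep in place of the double transpose, which should be equivalent.

```r
# Toy data (an assumption for illustration): the four numeric iris columns
X <- as.matrix(iris[, 1:4])

# Hypothetical helper: undo the rotation, then the scaling, then the centering
reconstruct <- function(pca) {
  X_hat <- pca$x %*% t(pca$rotation)
  # pca$scale is FALSE when no scaling was applied, otherwise a numeric vector
  if (is.numeric(pca$scale)) {
    X_hat <- sweep(X_hat, 2, pca$scale, `*`)
  }
  # pca$center is FALSE when no centering was applied, otherwise a numeric vector
  if (is.numeric(pca$center)) {
    X_hat <- sweep(X_hat, 2, pca$center, `+`)
  }
  X_hat
}

pca_centered <- prcomp(X)                 # centered only
pca_scaled   <- prcomp(X, scale. = TRUE)  # centered and scaled

# Compare with the original up to floating-point tolerance, ignoring dimnames
ok_centered <- isTRUE(all.equal(reconstruct(pca_centered), X, check.attributes = FALSE))
ok_scaled   <- isTRUE(all.equal(reconstruct(pca_scaled), X, check.attributes = FALSE))
print(c(centered = ok_centered, scaled = ok_scaled))
```

Checking is.numeric on pca$scale and pca$center lets the same helper handle every combination of the center and scale. arguments.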
konvas answered Sep 27 '22 23:09
