
How to fit a linear regression model with two principal components in R?

Let's say I have a data matrix d

pc = prcomp(d)

# pc1 and pc2 are the principal components  
pc1 = pc$rotation[,1] 
pc2 = pc$rotation[,2]

Then this should fit the linear regression model, right?

r = lm(y ~ pc1+pc2)

But then I get this error:

Error in model.frame.default(formula = y ~ pc1+pc2, drop.unused.levels = TRUE) : 
   unequal dimensions('pc1')

I guess there are packages out there that do this automatically, but shouldn't this work too?

phpdash asked Nov 26 '09 18:11


1 Answer

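Answer: you don't want pc$rotation. It is the rotation matrix (the loadings), not the matrix of rotated values (scores), so each of its columns has one entry per variable in d rather than one per observation. That is why lm() complains that pc1 does not match y.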

Make up some data:

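x1 = runif(100)
x2 = runif(100)
## note: rnorm() uses the length of a vector first argument as n,
## so y here is 100 standard-normal draws, unrelated to x1 and x2
y = rnorm(2+3*x1+4*x2)
d = cbind(x1,x2)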

pc = prcomp(d)
dim(pc$rotation)
## [1] 2 2

Oops. The "x" component is what we want. From ?prcomp:

x: if ‘retx’ is true the value of the rotated data (the centred (and scaled if requested) data multiplied by the ‘rotation’ matrix) is returned.
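
The definition quoted above can be checked by hand. This is only a minimal sketch, assuming simulated data and the default prcomp() settings (center = TRUE, scale. = FALSE); the names d_centered and scores_by_hand are illustrative and do not come from the original answer.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x1 <- runif(100)
x2 <- runif(100)
d  <- cbind(x1, x2)

pc <- prcomp(d)                  # defaults: center = TRUE, scale. = FALSE

# Centre each column, then multiply by the rotation (loadings) matrix
d_centered     <- scale(d, center = TRUE, scale = FALSE)
scores_by_hand <- d_centered %*% pc$rotation

# Compare the hand-computed scores with pc$x (dimnames stripped first)
all.equal(unname(scores_by_hand), unname(pc$x))
```

If the two agree, pc$x has one row per observation, which is the shape lm() needs on the right-hand side.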

dim(pc$x)
## [1] 100   2
lm(y ~ pc$x[,1] + pc$x[,2])
## Call:
## lm(formula = y ~ pc$x[, 1] + pc$x[, 2])
##
## Coefficients:
## (Intercept)    pc$x[, 1]    pc$x[, 2]  
##     0.04942      0.14272     -0.13557  
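
One possible variation, not part of the original answer: putting the scores into a data frame gives the model terms plain names (PC1, PC2) and makes prediction on new data straightforward. The sketch below assumes a response that really depends on x1 and x2, and the names fit_data, new_d and new_scores are hypothetical.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x1 <- runif(100)
x2 <- runif(100)
# Assumption: y is generated with mean 2 + 3*x1 + 4*x2 and unit-variance noise
y  <- rnorm(100, mean = 2 + 3 * x1 + 4 * x2)
d  <- cbind(x1, x2)

pc <- prcomp(d)

# Response and scores in one data frame; the score columns are PC1 and PC2
fit_data <- data.frame(y = y, pc$x)
fit      <- lm(y ~ PC1 + PC2, data = fit_data)

# Project new observations onto the same components, then predict from them
new_d      <- cbind(x1 = c(0.2, 0.8), x2 = c(0.5, 0.1))
new_scores <- as.data.frame(predict(pc, newdata = new_d))
predict(fit, newdata = new_scores)
```

Because both components of a two-variable data set are kept here, this fit gives the same predictions as regressing y on x1 and x2 directly; the approach only reduces dimension when fewer components than variables are used.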
Ben Bolker answered Oct 14 '22 15:10