Let's say I have a data matrix d:
pc = prcomp(d)
# pc1 and pc2 are the principal components
pc1 = pc$rotation[,1]
pc2 = pc$rotation[,2]
Then this should fit the linear regression model, right?
r = lm(y ~ pc1+pc2)
But then I get this error:
Error in model.frame.default(formula = y ~ pc1 + pc2, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'pc1')
I guess there are packages out there that do this automatically, but this should work too, right?
PCA has been used in linear regression to serve two basic goals. The first is dimensionality reduction on datasets where the number of predictor variables is too high; in that role it is an alternative to Partial Least Squares regression. The second is to replace correlated predictors with uncorrelated components. A package that automates the whole procedure is sketched below.
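On the packages question: one option is pcr() from the pls package, which runs the PCA step internally and fits the regression in one call. A minimal sketch, assuming the pls package is installed (the data here are made up for illustration):
library(pls)
set.seed(1)
dat = data.frame(x1 = runif(100), x2 = runif(100))
dat$y = 2 + 3*dat$x1 + 4*dat$x2 + rnorm(100)
fit = pcr(y ~ x1 + x2, ncomp = 2, data = dat)
summary(fit)  # proportion of variance explained per component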
Answer: you don't want pc$rotation; that is the rotation (loadings) matrix, not the matrix of rotated values (the scores). Its columns have one entry per variable rather than per observation, which is why the lengths don't match y.
Make up some data:
x1 = runif(100)
x2 = runif(100)
# Note: rnorm()'s first argument is the sample size, so this draws 100
# standard normals and y does not actually depend on x1 and x2; for a real
# signal you would use y = rnorm(100, mean = 2+3*x1+4*x2)
y = rnorm(2+3*x1+4*x2)
d = cbind(x1,x2)
pc = prcomp(d)
dim(pc$rotation)
## [1] 2 2
Oops. The "x" component is what we want. From ?prcomp:
x: if ‘retx’ is true the value of the rotated data (the centred (and scaled if requested) data multiplied by the ‘rotation’ matrix) is returned.
dim(pc$x)
## [1] 100 2
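As a sanity check, pc$x is exactly what the help page describes, the centred data multiplied by the rotation matrix:
all.equal(pc$x, scale(d, center = TRUE, scale = FALSE) %*% pc$rotation)
## [1] TRUE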
lm(y~pc$x[,1]+pc$x[,2])
##
## Call:
## lm(formula = y ~ pc$x[, 1] + pc$x[, 2])
##
## Coefficients:
##  (Intercept)    pc$x[, 1]    pc$x[, 2]
##      0.04942      0.14272     -0.13557
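For nicer coefficient names, note that prcomp() names the score columns "PC1" and "PC2", so an equivalent fit via a data frame is:
scores = data.frame(y, pc$x)           # columns: y, PC1, PC2
r = lm(y ~ PC1 + PC2, data = scores)
coef(r)                                # same coefficients as above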