I use R only a little bit and never use data frames, which makes understanding the correct use of predict difficult. I have my data in plain matrices, not data frames, call them a and b, which are N x p and M x p matrices respectively. I can run the regression lm(a[,1] ~ a[,-1]). I would like to use the resulting lm object to predict b[,1] from b[,-1]. My naive guess of predict(lm(a[,1] ~ a[,-1]), b[,-1]) doesn't work. What's the right syntax to use the lm to get a vector of predictions?
predict.lm produces a vector of predictions or a matrix of predictions and bounds with column names fit, lwr, and upr if interval is set. For type = "terms" this is a matrix with a column per term and may have an attribute "constant".
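As a rough illustration of those three return shapes, here is a minimal sketch on simulated data. The names d, fit, and new are made up for the example and are not part of the question.

```r
# Simulated data; d, fit, and new are illustrative names only.
set.seed(1)
d <- data.frame(x = rnorm(20))
d$y <- 1 + 2 * d$x + rnorm(20, sd = 0.5)

fit <- lm(y ~ x, data = d)
new <- data.frame(x = c(-1, 0, 1))

p_vec <- predict(fit, new)                           # plain named vector
p_int <- predict(fit, new, interval = "prediction")  # matrix: fit, lwr, upr
p_trm <- predict(fit, new, type = "terms")           # matrix, one column per term

colnames(p_int)
attr(p_trm, "constant")
```

Without interval the result is a vector with one element per row of new; setting interval switches the result to a three-column matrix.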
The function lm is the workhorse for fitting linear models. It takes as input a formula: suppose you have a data frame containing columns x (a regressor) and y (the regressand); you can then call lm(y ~ x) to fit the linear model y = β0 + β1 x + ε.
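A small sketch of that call, assuming simulated data with a known intercept of 1 and slope of 2 (the data frame name d and these values are chosen only for illustration):

```r
# Simulate y = 1 + 2 x + noise; the true coefficients are assumptions of this example.
set.seed(123)
d <- data.frame(x = rnorm(200))
d$y <- 1 + 2 * d$x + rnorm(200, sd = 0.1)

fit <- lm(y ~ x, data = d)  # passing data = d keeps the formula tied to column names
est <- coef(fit)            # named vector: "(Intercept)" (beta0) and "x" (beta1)
est
```

The estimates should land close to the simulated values, with the intercept playing the role of β0 and the x coefficient the role of β1.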
The predict() function computes predicted values from a fitted model, optionally for new data. You can also request confidence or prediction intervals to gauge the uncertainty of the predictions.
The lm() function fits linear models in R, usually to variables held in a data frame. It can be used to carry out regression, single-stratum analysis of variance, and analysis of covariance; the fitted model can then be used to predict values for data that were not in the original data frame.
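predict() expects newdata to be a data frame in which it looks up the variables named in the model formula, so a bare matrix such as b[,-1] cannot stand in for a[,-1]. You can store a whole matrix in one column of a data.frame: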
x <- a [, -1]
y <- a [, 1]
data <- data.frame (y = y, x = I (x))
str (data)
## 'data.frame': 10 obs. of 2 variables:
## $ y: num 0.818 0.767 -0.666 0.788 -0.489 ...
## $ x: AsIs [1:10, 1:9] 0.916274.... 0.386565.... 0.703230.... -2.64091.... 0.274617.... ...
model <- lm (y ~ x, data = data)
newdata <- data.frame (x = I (b [, -1]))
predict (model, newdata)
## 1 2
## -3.795722 -4.778784
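If you would rather not wrap the matrix with I(), two alternatives may also work. The sketch below uses randomly generated stand-ins for a and b, so the sizes and values are assumptions. It relies on as.data.frame() naming the columns of an unnamed matrix V1, V2, and so on, and the manual route assumes no coefficient is NA (that is, no rank-deficient fit).

```r
# Random stand-ins for the matrices in the question (sizes are arbitrary).
set.seed(42)
p <- 4
a <- matrix(rnorm(30 * p), ncol = p)  # N x p training matrix
b <- matrix(rnorm(5 * p), ncol = p)   # M x p matrix to predict for

# Option 1: convert both matrices to data frames with matching column names.
df_a <- as.data.frame(a)              # columns V1 .. V4
df_b <- as.data.frame(b)
fit_df <- lm(V1 ~ ., data = df_a)
pred_df <- predict(fit_df, newdata = df_b)

# Option 2: keep the matrix formula and multiply the new design matrix
# (a column of ones for the intercept plus b[, -1]) by the coefficients.
fit_mat <- lm(a[, 1] ~ a[, -1])
pred_mat <- drop(cbind(1, b[, -1]) %*% coef(fit_mat))

all.equal(as.numeric(pred_df), as.numeric(pred_mat))
```

Both routes give the same predictions here; the data frame route keeps predict() and its interval options available, while the matrix product only yields point predictions.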
The paper about the pls package (Mevik, B.-H. and Wehrens, R. The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software, 2007, 18, 1-24) explains this technique.
Another example, with a spectroscopic data set (quinine fluorescence), is in vignette ("flu") of my package hyperSpec.