Using lm and predict on data in matrices

Tags: dataframe, r, lm, predict

I use R only occasionally and never use data frames, which makes it hard to work out the correct use of predict. My data are in plain matrices, not data frames; call them a and b, which are N x p and M x p matrices respectively. I can run the regression lm(a[,1] ~ a[,-1]). I would like to use the resulting lm object to predict b[,1] from b[,-1]. My naive guess of predict(lm(a[,1] ~ a[,-1]), b[,-1]) doesn't work. What's the right syntax to get a vector of predictions from the lm object?


asked Mar 07 '13 14:03

pythonic metaphor

1 Answer

You can store a whole matrix in one column of a data.frame:

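x <- a [, -1]   # predictor matrix: all columns except the first
y <- a [,  1]   # response: first column
# I () keeps the matrix as a single matrix-valued column named x
data <- data.frame (y = y, x = I (x))
str (data)
## 'data.frame':    10 obs. of  2 variables:
## $ y: num  0.818 0.767 -0.666 0.788 -0.489 ...
## $ x: AsIs [1:10, 1:9] 0.916274.... 0.386565.... 0.703230.... -2.64091.... 0.274617.... ...

# fit on the data frame so the model does not depend on x and y in the workspace
model <- lm (y ~ x, data = data)
# newdata needs a column with the same name (x) as in the formula
newdata <- data.frame (x = I (b [, -1]))
predict (model, newdata)
##         1         2 
## -3.795722 -4.778784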
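
For anyone who wants to try this without their own data, here is a minimal self-contained sketch of the same technique. The simulated matrices, the sizes N, M and p, and the names train, pred and manual are assumptions made for illustration only; they are not the data behind the output above. The last lines cross-check predict against the fitted coefficients applied by hand.

```r
# Simulated stand-ins for the matrices a (N x p) and b (M x p); values are made up.
set.seed(1)
N <- 10; M <- 2; p <- 4
a <- matrix(rnorm(N * p), nrow = N, ncol = p)
b <- matrix(rnorm(M * p), nrow = M, ncol = p)

# Wrap the predictor matrix in I() so it stays one matrix-valued column named x.
train <- data.frame(y = a[, 1], x = I(a[, -1]))
model <- lm(y ~ x, data = train)

# newdata carries a column with the same name (x) and the same number of matrix columns.
newdata <- data.frame(x = I(b[, -1]))
pred <- predict(model, newdata)

# Cross-check: intercept plus b[, -1] times the slopes, computed by hand.
manual <- drop(cbind(1, b[, -1]) %*% coef(model))
length(pred)                              # one prediction per row of b
all.equal(unname(pred), unname(manual))   # compares predict() with the manual result
```

The column in newdata has to be called x as well, because predict looks variables up by the names used in the formula.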

The paper about the pls package (Mevik, B.-H. and Wehrens, R.: The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software, 2007, 18, 1-24) explains this technique.

Another example, with a spectroscopic data set (quinine fluorescence), is in vignette ("flu") of my package hyperSpec.

answered Sep 19 '22 23:09

cbeleites unhappy with SX
