
Using predict() with a list of lm() objects

Tags: r, lm, plyr, predict

I have data on which I regularly run regressions. Each "chunk" of the data gets its own regression fit; each state, for example, might have a different function that explains the dependent variable. This looks like a typical split-apply-combine problem, so I'm using the plyr package. I can easily create a list of lm() objects, and that works well. However, I can't quite wrap my head around how to use those objects later to predict values for a separate data.frame.

Here's a totally contrived example illustrating what I'm trying to do:

# setting up some fake data
set.seed(1)
funct <- function(myState, myYear){
   rnorm(1, 100, 500) +  myState + (100 * myYear) 
}
state <- 50:60
year <- 10:40
myData <- expand.grid( year, state)
names(myData) <- c("year","state")
myData$value <- apply(myData, 1, function(x) funct(x[2], x[1]))
## ok, done with the fake data generation. 

library(plyr)

modelList <- dlply(myData, "state", function(x) lm(value ~ year, data=x))
## if you want to see the summaries of the lm() do this:  
    # lapply(modelList, summary)

state <- 50:60
year <- 50:60
newData <- expand.grid( year, state)
names(newData) <- c("year","state") 
## now how do I predict the values for newData$value 
   # using the regressions in modelList? 

So how do I use the lm() objects contained in modelList to predict values using the year and state independent values from newData?

JD Long asked Dec 13 '11 22:12




3 Answers

Here's my attempt:

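# "Naughty": `piece` is a variable internal to plyr, so this depends on
# implementation details of ddply()
predNaughty <- ddply(newData, "state", transform,
  value=predict(modelList[[paste(piece$state[1])]], newdata=piece))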
head(predNaughty)
#   year state    value
# 1   50    50 5176.326
# 2   51    50 5274.907
# 3   52    50 5373.487
# 4   53    50 5472.068
# 5   54    50 5570.649
# 6   55    50 5669.229
# Same result without touching plyr internals: each chunk arrives as x
predDiggsApproved <- ddply(newData, "state", function(x)
  transform(x, value=predict(modelList[[paste(x$state[1])]], newdata=x)))
head(predDiggsApproved)
#   year state    value
# 1   50    50 5176.326
# 2   51    50 5274.907
# 3   52    50 5373.487
# 4   53    50 5472.068
# 5   54    50 5570.649
# 6   55    50 5669.229
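
A note on why both versions wrap the state in paste(): the list of models is named by state, and `[[` treats a number as a position rather than a name. The sketch below illustrates this with a hypothetical stand-in list (`lookup` is not part of the question's data, and its elements are plain strings rather than lm() fits):

```r
# Hypothetical stand-in for modelList: a list whose names are the state codes
lookup <- list("50" = "model for state 50", "51" = "model for state 51")
state <- 50

# Converting the number to a character string selects the element by name
byName <- lookup[[paste(state)]]
byName2 <- lookup[[as.character(state)]]  # equivalent, arguably more explicit

# Using the number directly asks for the 50th element, which does not exist
byPosition <- tryCatch(lookup[[state]],
                       error = function(e) "subscript out of bounds")
```

as.character() does the same job as paste() here and may state the intent more clearly.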

Edit by JD Long:

I was inspired enough to work out an adply() option:

pred3 <- adply(newData, 1,  function(x)
    predict(modelList[[paste(x$state)]], newdata=x))
head(pred3)
#   year state        1
# 1   50    50 5176.326
# 2   51    50 5274.907
# 3   52    50 5373.487
# 4   53    50 5472.068
# 5   54    50 5570.649
# 6   55    50 5669.229
Joshua Ulrich answered Oct 17 '22 23:10


You need to use mdply to supply both the model and the data to each function call:

dataList <- dlply(newData, "state")

preds <- mdply(cbind(mod = modelList, df = dataList), function(mod, df) {
  mutate(df, pred = predict(mod, newdata = df))
})
hadley answered Oct 17 '22 23:10


Here is a solution using only base R. The output format differs (mapply() returns a matrix with one column per state, rather than a long data frame), but all the values are there.

models <- lapply(split(myData, myData$state), 'lm', formula = value ~ year)
pred4  <- mapply('predict', models, split(newData, newData$state))
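
If a long data frame like the plyr output is preferred over the matrix, one possible base R sketch is to pair each model with its chunk of new data using Map() and stack the results. The names `chunks`, `predList` and `predLong` are made up for this illustration, and the fake data is a vectorised approximation of the question's setup rather than a copy of it:

```r
# Fake data along the lines of the question (vectorised, base R only)
set.seed(1)
myData <- expand.grid(year = 10:40, state = 50:60)
myData$value <- rnorm(nrow(myData), 100, 500) + myData$state + 100 * myData$year
newData <- expand.grid(year = 50:60, state = 50:60)

# One lm() per state; split() names each element by its state code
models <- lapply(split(myData, myData$state),
                 function(d) lm(value ~ year, data = d))

# Split the new data the same way, then match models to chunks by name
chunks <- split(newData, newData$state)
predList <- Map(function(mod, d) {
  d$value <- predict(mod, newdata = d)  # predictions from this state's model
  d
}, models[names(chunks)], chunks)

# Stack the per-state data frames back into one long data frame
predLong <- do.call(rbind, predList)
rownames(predLong) <- NULL
head(predLong)
```

Indexing `models` by `names(chunks)` keeps each chunk paired with the model for the same state, even if the two lists were built in a different order.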
Ramnath answered Oct 17 '22 21:10