
Manually set coefficient for new factor level when predicting

Tags:

r

I have a linear model in which one of the independent variables is a factor, and I am trying to make predictions on a data set that contains a new factor level (a level that wasn't in the data set the model was estimated on). I want to make predictions for the observations with the new level by manually specifying the coefficient that will be applied to it. For example, suppose I estimate daily sales volumes for three types of stores, and I introduce a fourth type of store into the dataset. I have no historical data for it, but I might assume it will behave like some weighted combination of the other store types, for which I have model coefficients.

If I try to apply predict.lm() to the new data I will get an error telling me that the factor has new levels (this makes sense).

df <- data.frame(y=rnorm(100), x1=factor(rep(1:4,25)))
lm1 <- lm(y ~ x1, data=df)
newdata <- data.frame(y=rnorm(100), x1=factor(rep(1:5,20)))
predict(lm1, newdata)

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : 
  factor x1 has new levels 5

I could do the prediction manually by simply multiplying the coefficients by the individual columns in the data.frame. However, this is cumbersome given that the real model I'm working with has many variables and interaction terms, and I want to be able to easily cycle through various model specifications by changing the model formula. Is there a way for me to essentially add a new coefficient to a model object and then use it to make forecasts? If not, is there another approach that is less cumbersome than setting up the entire prediction step manually?

Abiel asked Aug 19 '13 00:08



1 Answer

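Assuming you want level 5 to be an evenly weighted combination of the four existing levels, you can build the model matrix for the new data, plug in 0.25 for the level-5 rows, drop the level-5 column, and multiply by the coefficients from the model. Level 1 is the reference level, so its 25% share is already carried by the intercept and only the dummies for levels 2 to 4 need the weight.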

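# Design matrix for the new data; columns are (Intercept), x12, x13, x14, x15
n.mat <- model.matrix(~x1, data=newdata)
# For the level-5 rows, put a weight of 0.25 on each of the level 2-4 dummies
n.mat[n.mat[,5] == 1, 2:4] <- .25
# Drop the level-5 column so the columns line up with coef(lm1)
n.mat <- n.mat[,-5]
n.prediction <- n.mat %*% coef(lm1)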
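
The same idea can be sketched for unequal weights. This is only an illustration under stated assumptions: the weight vector `w` and the helper name `is.new` are hypothetical, the weights are assumed to sum to 1, and the model is assumed to use R's default treatment contrasts with no interactions involving `x1` (interaction columns would need the same kind of adjustment).

```r
set.seed(1)
# Same toy setup as in the question: four known levels, a fifth in newdata.
df <- data.frame(y = rnorm(100), x1 = factor(rep(1:4, 25)))
lm1 <- lm(y ~ x1, data = df)
newdata <- data.frame(y = rnorm(100), x1 = factor(rep(1:5, 20)))

# Hypothetical weights for levels 1..4 (an assumption; they should sum to 1).
w <- c(0.4, 0.3, 0.2, 0.1)

n.mat <- model.matrix(~ x1, data = newdata)
is.new <- n.mat[, "x15"] == 1   # rows that belong to the unseen level

# Level 1 is the reference level, so its weight is carried by the intercept
# column (always 1); only the weights for levels 2..4 go into the dummies.
# rep(..., each = ) fills the sub-matrix column by column.
n.mat[is.new, c("x12", "x13", "x14")] <- rep(w[-1], each = sum(is.new))

# Keep only the columns the fitted model knows about, in coefficient order.
n.mat <- n.mat[, names(coef(lm1))]
n.prediction <- drop(n.mat %*% coef(lm1))
```

Selecting columns by name rather than by position keeps the matrix aligned with `coef(lm1)` if the formula gains more terms, although each new term involving `x1` would still need its own weighting.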
Neal Fultz answered Oct 21 '22 14:10
