I am trying to get the final model using backward elimination in R, but I got the following error message when I ran the code. Could anyone please help me with this?
base<-lm(Eeff~NDF,data=phuong)
fullmodel<-lm(Eeff~NDF+ADF+CP+NEL+DMI+FCM,data=phuong)
step(full, direction = "backward", trace=FALSE )
> Error in step(full, direction = "backward", trace = FALSE) :
number of rows in use has changed: remove missing values?
When comparing different submodels, it is necessary that they be fitted to the same set of data -- otherwise the results just don't make sense. (Consider the extreme situation where you have two predictors A and B, each measured on a different half of your observations: the models y~A and y~B will be fitted to non-overlapping subsets of the data, and the model y~A+B, which needs both predictors, will have no complete cases to fit at all.) Thus, step won't allow you to compare submodels that (because of automatic removal of cases containing NA values) are using different subsets of the original data set.
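A less extreme version of this situation can be checked directly. The sketch below uses made-up data (the data frame d and its columns y, A and B are purely illustrative, not the asker's data) and counts how many rows each fit actually uses:

```r
# Made-up data: 10 rows, A missing in rows 1-3, B missing in rows 8-10
set.seed(1)
d <- data.frame(y = rnorm(10), A = rnorm(10), B = rnorm(10))
d$A[1:3] <- NA
d$B[8:10] <- NA

# lm() silently drops rows with an NA in any variable the formula uses
fit_ab <- lm(y ~ A + B, data = d)
fit_a  <- lm(y ~ A, data = d)
fit_b  <- lm(y ~ B, data = d)

# Number of rows each model was actually fitted to
n_used <- c(AB = nobs(fit_ab), A = nobs(fit_a), B = nobs(fit_b))
print(n_used)
```

Here the full model uses 4 rows while each single-predictor model uses 7, and not even the same 7. Dropping a term during backward elimination therefore changes the data the model is fitted to, which is what the error message is complaining about.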
Using na.omit on the original data set should fix the problem.
fullmodel <- lm(Eeff ~ NDF + ADF + CP + NEL + DMI + FCM, data = na.omit(phuong))
step(fullmodel, direction = "backward", trace=FALSE )
However, if you have a lot of NA values in different predictors, you may end up losing a lot of your data set -- in an extreme case you could lose the entire data set. If this happens, you have to reconsider your modeling strategy ...
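One way to see how much data would be lost before fitting is to count the missing values, and to apply na.omit only to the columns the model uses, so that NA values in unrelated columns don't cost you rows. This is only a sketch: the phuong data frame below is a made-up stand-in with invented values, and the notes column is hypothetical, added just to show the effect of an unused column full of NA.

```r
# Made-up stand-in for the asker's data frame (values are invented)
set.seed(42)
phuong <- data.frame(
  Eeff = rnorm(20), NDF = rnorm(20), ADF = rnorm(20), CP = rnorm(20),
  NEL = rnorm(20), DMI = rnorm(20), FCM = rnorm(20),
  notes = NA_character_   # hypothetical column, not in the model, all NA
)
phuong$ADF[c(2, 5)] <- NA
phuong$FCM[5] <- NA

model_vars <- c("Eeff", "NDF", "ADF", "CP", "NEL", "DMI", "FCM")

# Missing values per model variable
na_counts <- colSums(is.na(phuong[model_vars]))
print(na_counts)

# Rows left after na.omit on everything vs. on the model columns only
n_all   <- nrow(na.omit(phuong))
n_model <- nrow(na.omit(phuong[model_vars]))
print(c(all_columns = n_all, model_columns = n_model))

# Fit and run backward elimination on the complete cases for the model columns
model_data <- na.omit(phuong[model_vars])
fullmodel <- lm(Eeff ~ NDF + ADF + CP + NEL + DMI + FCM, data = model_data)
final <- step(fullmodel, direction = "backward", trace = FALSE)
```

In this stand-in, na.omit(phuong) leaves no rows at all because of the unused notes column, while restricting to the model columns keeps 18 of the 20 rows.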