Is there a function for substituting (or removing entirely) explanatory variables in a linear model (lm)?

Question

I have a linear model with many explanatory (independent) variables

model <- lm(y ~ x1 + x2 + x3 + ... + x100)

some of which are linearly dependent on each other (multicollinearity).

I want to programmatically find the name of the explanatory variable with the highest VIF (x2, for example), delete it from the formula, and then rerun lm with the new formula

model <- lm(y ~ x1 + x3 + ... + x100)

I have already worked out how to retrieve the name of the explanatory variable with the highest VIF:

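library(car)  # assumption: vif() here is car::vif()

# Return the name(s) of the term(s) with the largest VIF in model x
max_vif <- function(x) {
  vifac <- data.frame(vif(x))
  # rows of the VIF table whose value equals the maximum
  nameofmax <- rownames(which(vifac == max(vifac), arr.ind = TRUE))
  return(nameofmax)
}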

But I still don't understand how to look up that variable in the formula, delete it, and refit the model.

bouncyball · Accepted Answer

We can use the update function to refit the model without the column that needs to be removed. First fit a model, then call update to change that model's formula. The new formula can be given as a character string, so you can paste together the template . ~ . (meaning "keep the current response and predictors") and the variable(s) to remove, each preceded by a minus sign.

Here is an example:

fit1 <- lm(wt ~ mpg + cyl + am, data = mtcars)
coef(fit1)

# (Intercept)         mpg         cyl          am 
#  4.83597190 -0.09470611  0.08015745 -0.52182463 

rm_var <- "am"
fit2 <- update(fit1, paste0(".~. - ", rm_var))
coef(fit2)

# (Intercept)         mpg         cyl 
#  5.07595833 -0.11908115  0.08625557
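Since the question also asks about substituting a variable, the same string-building idea can be sketched for a swap: subtract one term and add another in a single update call. The names old_var and new_var are illustrative, and hp is simply another mtcars column chosen for the example.

```r
# Refit, replacing one predictor with another in a single update() call
fit1 <- lm(wt ~ mpg + cyl + am, data = mtcars)

old_var <- "am"  # term to take out (illustrative choice)
new_var <- "hp"  # term to put in (illustrative choice)

# ". ~ . - am + hp": keep everything, drop old_var, add new_var
fit3 <- update(fit1, paste(". ~ . -", old_var, "+", new_var))

# Inspect which terms the refitted model contains
attr(terms(fit3), "term.labels")
```

A formula object works as well as a string, e.g. update(fit1, . ~ . - am + hp), but the string form is what lets the variable names come from other code.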

Using max_vif we can wrap this into a function:

rm_max_vif <- function(x){
  # find variable(s) needing to be removed
  rm_var <- max_vif(x)
  # concatenate with "-" to remove variable(s) from formula
  rm_var <- paste(paste0("-", rm_var), collapse = " ")
  # update model
  update(x, paste0(".~.", rm_var))
}
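To repeat the removal until multicollinearity is acceptable, the update step can be placed in a loop. The following is only a sketch under several assumptions: simple_vif is a base-R stand-in for vif() that is valid for numeric predictors only (so the block runs without extra packages), drop_high_vif is a hypothetical helper name, and the threshold of 5 is an arbitrary rule-of-thumb default rather than anything from the question. It also assumes no predictors are exactly linearly dependent, since the correlation matrix could not be inverted in that case.

```r
# Base-R stand-in for vif() (numeric predictors only): the VIFs are the
# diagonal of the inverse of the predictors' correlation matrix.
simple_vif <- function(fit) {
  X <- model.matrix(fit)[, -1, drop = FALSE]  # drop the intercept column
  diag(solve(cor(X)))
}

# Hypothetical helper: drop the highest-VIF term and refit, repeating
# until every VIF is at or below the threshold (or one predictor is left).
drop_high_vif <- function(fit, threshold = 5) {
  repeat {
    v <- simple_vif(fit)
    if (length(v) < 2 || max(v) <= threshold) break
    worst <- names(which.max(v))                 # term with the largest VIF
    fit <- update(fit, paste(". ~ . -", worst))  # refit without it
  }
  fit
}

fit <- lm(mpg ~ disp + hp + wt + cyl + drat, data = mtcars)
reduced <- drop_high_vif(fit, threshold = 5)
formula(reduced)
```

With vif() available, simple_vif(fit) could be swapped for vif(fit), or the loop body could call rm_max_vif directly; the stopping rule is the part that needs choosing for your data.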

Tags: r, linear-regression, lm

Konstantin M.
