
Remove variable wrapped in function from model formula in R

Tags:

regex

r

I have a model with transformed variables, e.g.:

data = data.frame(y = runif(100, 0, 10), x1 = runif(100, 0, 10), x2 = runif(100, 0, 10))
mod = lm(y ~ scale(x1) + scale(x2), data)

I would like to remove one entire variable from the formula, like so:

mod = lm(y ~ scale(x1), data) # x2 is gone!

But I would like to do this using a user-supplied character string naming the variable to be removed (in other words, I'm wrapping this in a function, so it's not feasible to edit the formula by hand as I have here).

If the variable were untransformed, this would be simple using gsub:

remove.var = "x2"
update(mod, formula. = as.formula(gsub(remove.var, "", format(formula(mod)))))

But with the transformed variable, it returns the wholly predictable error:

 > Error in as.matrix(x) : argument "x" is missing, with no default

because scale() is still in the formula!
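For illustration (not part of the original post), printing the intermediate string shows where the text substitution goes wrong. The names `f`, `txt`, and `stripped` are made up for this sketch:

```r
# The formula from the model above, as a plain formula object
f <- y ~ scale(x1) + scale(x2)

# format() deparses the formula to a character string
txt <- format(f)
txt
# "y ~ scale(x1) + scale(x2)"

# gsub() deletes only the variable name and leaves the wrapper behind
stripped <- gsub("x2", "", txt)
stripped
# "y ~ scale(x1) + scale()"
```

The leftover `scale()` call has no argument, which is what triggers the `as.matrix(x)` error when the model is refit.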

Is there a way to do this with a regular expression, or some other way that is totally obvious and that I am not seeing? I would like it to be scalable to other types of transformations, e.g. log, log10, etc.

As another layer of complexity, suppose that the variable to be removed also appeared in an interaction:

 mod = lm(y ~ scale(x1) * scale(x2), data)

In this case, one would have to remove the interaction * as well (errant +s, I have found, are ok).

Any help is much appreciated. Thanks!

jslefche asked Nov 12 '14 22:11




1 Answer

A terms object is a formula with additional attributes, so you can drop a term by position with drop.terms() and pass the result to update():

update(mod, formula. = drop.terms(mod$terms, 2, keep.response = TRUE))

Call:
lm(formula = y ~ scale(x1), data = data)

Coefficients:
(Intercept)    scale(x1)  
     5.0121       0.1236  

If you need to calculate that position from a string argument, then you can grep the term.labels attribute:

> grep( "x2", attr( mod$terms, "term.labels") )
[1] 2
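One caveat worth keeping in mind: a plain `grep` matches substrings, so a pattern like "x2" would also hit a term built on a differently named variable. The sketch below uses a made-up label vector (`x20` is a hypothetical variable, not one from the question) and shows one possible way to tighten the match with word boundaries:

```r
# Hypothetical term labels; x20 is invented to show the substring problem
labels <- c("scale(x1)", "scale(x2)", "log(x20)")

# Plain substring match also picks up log(x20)
loose <- grep("x2", labels)
loose
# 2 3

# Word boundaries restrict the match to x2 as a whole token
strict <- grep("\\bx2\\b", labels, perl = TRUE)
strict
# 2
```

Word boundaries are still only an approximation, since R names may contain dots (a variable called `x2.sq` would match `\bx2\b`), so this should be treated as a partial safeguard rather than a complete one.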

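The grep approach also succeeds with the interaction formula, because both the scale(x2) main effect and the scale(x1):scale(x2) interaction contain "x2" in their labels, so both terms are dropped: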

update(mod, formula=drop.terms(mod$terms, grep( "x2", attr( mod$terms, "term.labels") ), keep.response=TRUE) )
#----------

Call:
lm(formula = y ~ scale(x1), data = data)

Coefficients:
(Intercept)    scale(x1)  
     5.0121       0.1236  
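Since the question asks for something that can be wrapped in a function, here is a minimal sketch of how the steps above might be packaged. The helper name `drop_variable` is hypothetical and not part of the original answer. Instead of `grep`, it uses `all.vars()` on each term label so that the variable name is matched exactly, whatever transformation wraps it. The data frame is called `dat` here, and the seed is arbitrary.

```r
# Hypothetical helper (not from the original answer): refit `model` without
# any term that involves the variable named by the string `var`.
drop_variable <- function(model, var) {
  tt <- terms(model)
  labels <- attr(tt, "term.labels")
  # all.vars() extracts the variable names used in each term label, so "x2"
  # matches scale(x2) and scale(x1):scale(x2) but would not match log(x20)
  uses_var <- vapply(
    labels,
    function(lab) var %in% all.vars(parse(text = lab)),
    logical(1)
  )
  if (!any(uses_var)) return(model)                  # nothing to remove
  if (all(uses_var)) return(update(model, . ~ 1))    # intercept-only refit
  update(model, formula. = drop.terms(tt, which(uses_var), keep.response = TRUE))
}

# Example data in the same shape as the question's
set.seed(1)
dat <- data.frame(y = runif(100, 0, 10), x1 = runif(100, 0, 10), x2 = runif(100, 0, 10))
mod_int <- lm(y ~ scale(x1) * scale(x2), dat)

reduced <- drop_variable(mod_int, "x2")
formula(reduced)
# y ~ scale(x1)
```

Because `update()` re-evaluates the stored call, the data frame has to be visible from where the helper is defined; that holds here because both live at the top level.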
IRTFM answered Oct 12 '22 01:10