
How can I replace one term in an R formula with two?

Tags: r, r-formula

I have something along the lines of

y ~ x + z

And I would like to transform it to

y ~ x_part1 + x_part2 + z

More generally, I would like to have a function that takes a formula and returns that formula with all terms that match "^x$" replaced by "x_part1" and "x_part2". Here's my current solution, but it just feels so kludgey...

my.formula <- fruit ~ apple + banana
var.to.replace <- 'apple'
my.terms <- labels(terms(my.formula))
new.terms <- paste0('(', 
                    paste0(var.to.replace, 
                           c('_part1', '_part2'),
                           collapse = '+'),
                    ')')
new.formula <- reformulate(termlabels = gsub(pattern = var.to.replace,
                                             replacement = new.terms,
                                             x = my.terms),
                           response = my.formula[[2]])

An additional caveat is that the input formula may be specified with interactions.

y ~ b*x + z

should output one of these (equivalent) formulae

y ~ b*(x_part1 + x_part2) + z
y ~ b + (x_part1 + x_part2) + b:(x_part1 + x_part2) + z
y ~ b + x_part1 + x_part2 + b:x_part1 + b:x_part2 + z
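One way to confirm that the three forms above are equivalent is to compare their expanded term labels. This is a minimal sketch using only base R's terms() and labels(), which the question already uses; the names f1, f2, f3 and same.terms are hypothetical.

```r
f1 <- y ~ b*(x_part1 + x_part2) + z
f2 <- y ~ b + (x_part1 + x_part2) + b:(x_part1 + x_part2) + z
f3 <- y ~ b + x_part1 + x_part2 + b:x_part1 + b:x_part2 + z

# terms() expands each formula; labels() returns the expanded term names
l1 <- labels(terms(f1))
l2 <- labels(terms(f2))
l3 <- labels(terms(f3))

# TRUE when all three formulae expand to the same set of terms
same.terms <- setequal(l1, l2) && setequal(l1, l3)
print(same.terms)
```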

MrFlick has advocated the use of

substitute(y ~ b*x + z, list(x=quote(x_part1 + x_part2)))

but when I have stored the formula I want to modify in a variable, as in

my.formula <- fruit ~ x + banana

this approach seems to require a little more massaging:

substitute(my.formula, list(x=quote(apple_part1 + apple_part2)))
# my.formula   (substitute() does not evaluate its first argument, so the symbol comes back unchanged)

The necessary change to that approach was:

do.call(what = 'substitute',
        args = list(my.formula, list(x=quote(apple_part1 + apple_part2))))

But I can't figure out how to use this approach when both 'x' and c('x_part1', 'x_part2') are stored in named variables, e.g. var.to.replace and new.terms above.

rcorty asked Aug 09 '16 15:08




2 Answers

You can use the substitute() function for this:

substitute(y ~ b*x + z, list(x=quote(x_part1 + x_part2)))
# y ~ b * (x_part1 + x_part2) + z

Here we use the named list to tell R to replace the variable x with the expression x_part1 + x_part2.
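When the term name and its replacements are held in character variables (as with var.to.replace and new.terms in the question), the same substitute() idea can be driven from strings. The following is a sketch under assumptions, not part of the original answer: it reuses the do.call() workaround from the question, and the names replacement, subs and new.call are hypothetical.

```r
my.formula <- fruit ~ apple + banana
var.to.replace <- "apple"
new.terms <- c("apple_part1", "apple_part2")

# Turn the character vector into the expression apple_part1 + apple_part2
replacement <- parse(text = paste(new.terms, collapse = " + "))[[1]]

# Named list mapping the symbol to replace onto its replacement
subs <- setNames(list(replacement), var.to.replace)

# do.call() hands the formula itself (not the symbol my.formula) to substitute()
new.call <- do.call("substitute", list(my.formula, subs))

# Make sure the result is a formula that carries the original environment
new.formula <- as.formula(new.call, env = environment(my.formula))
print(new.formula)
```

Because substitute() works on symbols rather than on text, a variable such as pineapple would not be touched when replacing apple, unlike the gsub()-based approach.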

MrFlick answered Oct 25 '22 22:10


You can write a recursive function to modify the expression tree of the formula:

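replace_term <- function(f, old, new){
  n <- length(f)
  # A call (including a formula) has length > 1: recurse into each element
  if(n > 1) {
    for(i in seq_len(n)) f[[i]] <- Recall(f[[i]], old, new)

    return(f)
  }

  # Leaf: swap in the replacement only when it is exactly the old symbol
  if(identical(f, old)) new else f
}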

You can use it to modify, e.g., formulas with interactions:

> replace_term(y~x*a+z - x, quote(x), quote(x1 + x2))
y ~ (x1 + x2) * a + z - (x1 + x2)
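
The same tree walk can also be driven from character variables such as var.to.replace and new.terms in the question. This is a sketch under assumptions: the function is repeated (with identical() for the comparison) only so the example is self-contained, and old.sym and new.expr are hypothetical names.

```r
# Recursive replacement on the expression tree, same idea as replace_term above
replace_term <- function(f, old, new) {
  if (length(f) > 1) {
    for (i in seq_along(f)) f[[i]] <- replace_term(f[[i]], old, new)
    return(f)
  }
  if (identical(f, old)) new else f
}

my.formula <- fruit ~ apple*pineapple + banana
var.to.replace <- "apple"
new.terms <- c("apple_part1", "apple_part2")

# as.name() turns the string into a symbol; parse() builds apple_part1 + apple_part2
old.sym <- as.name(var.to.replace)
new.expr <- parse(text = paste(new.terms, collapse = " + "))[[1]]

new.formula <- replace_term(my.formula, old.sym, new.expr)
print(new.formula)

# Symbols are compared whole, so pineapple is left alone
print(all.vars(new.formula))
```

Comparing whole symbols gives the exact-match behaviour the question asks for ("^x$") without any regular expressions.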

Neal Fultz answered Oct 25 '22 22:10