I have something along the lines of
y ~ x + z
And I would like to transform it to
y ~ x_part1 + x_part2 + z
More generally, I would like to have a function that takes a formula and returns that formula with all terms that match "^x$" replaced by "x_part1" and "x_part2". Here's my current solution, but it just feels so kludgey...
my.formula <- fruit ~ apple + banana
var.to.replace <- 'apple'
my.terms <- labels(terms(my.formula))
new.terms <- paste0('(',
                    paste0(var.to.replace,
                           c('_part1', '_part2'),
                           collapse = '+'),
                    ')')
new.formula <- reformulate(termlabels = gsub(pattern = var.to.replace,
                                             replacement = new.terms,
                                             x = my.terms),
                           response = my.formula[[2]])
An additional caveat is that the input formula may be specified with interactions.
y ~ b*x + z
should output one of these (equivalent) formulae
y ~ b*(x_part1 + x_part2) + z
y ~ b + (x_part1 + x_part2) + b:(x_part1 + x_part2) + z
y ~ b + x_part1 + x_part2 + b:x_part1 + b:x_part2 + z
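One way to check that these three forms really are equivalent is to compare the term labels R expands each of them to. This is only a rough sketch; the helper name same_terms is made up for illustration and is not part of any package.

```r
# The three candidate outputs listed above
f1 <- y ~ b*(x_part1 + x_part2) + z
f2 <- y ~ b + (x_part1 + x_part2) + b:(x_part1 + x_part2) + z
f3 <- y ~ b + x_part1 + x_part2 + b:x_part1 + b:x_part2 + z

# Hypothetical helper: TRUE when two formulas expand to the same set of terms
same_terms <- function(f, g) {
  setequal(labels(terms(f)), labels(terms(g)))
}

same_terms(f1, f2)
same_terms(f1, f3)
```

Comparing with setequal() ignores the order of the terms, which seems reasonable here since only the set of main effects and interactions matters.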
MrFlick has advocated the use of
substitute(y ~ b*x + z, list(x=quote(x_part1 + x_part2)))
but when I have stored the formula I want to modify in a variable, as in
my.formula <- fruit ~ x + banana
this approach needs a little more massaging, because substitute() does not evaluate its first argument:
substitute(my.formula, list(x=quote(apple_part1 + apple_part2)))
# my.formula
The necessary change to that approach was:
do.call(what = 'substitute',
        args = list(my.formula, list(x=quote(x_part1 + x_part2))))
But I can't figure out how to use this approach when both 'x' and c('x_part1', 'x_part2') are stored in variables, e.g. var.to.replace and new.terms above.
You can use the substitute function for this:
substitute(y ~ b*x + z, list(x=quote(x_part1 + x_part2)))
# y ~ b * (x_part1 + x_part2) + z
Here we use the named list to tell R to replace the variable x with the expression x_part1 + x_part2.
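For the case where the formula, the variable name and the replacement names are all held in variables, something along these lines might work. It is a sketch rather than a definitive solution: part.names and replacement are names introduced here for illustration, and the call to as.formula() at the end is an assumption about wanting a formula object back in the original environment.

```r
my.formula     <- fruit ~ apple + banana
var.to.replace <- "apple"
# Illustrative name: the new variable names, as strings
part.names     <- paste0(var.to.replace, c("_part1", "_part2"))

# Build the expression apple_part1 + apple_part2 from the strings
replacement <- Reduce(function(lhs, rhs) call("+", lhs, rhs),
                      lapply(part.names, as.name))

# Named list mapping the symbol to replace to its replacement expression
env <- setNames(list(replacement), var.to.replace)

# do.call() evaluates my.formula first, so substitute() sees the formula
# itself rather than the symbol my.formula
new.call <- do.call("substitute", list(my.formula, env))

# Turn the result back into a formula in the original environment
new.formula <- as.formula(new.call, env = environment(my.formula))
new.formula
```

Because the replacement is built from symbols rather than pasted text, only the whole symbol apple is matched, which is what the "^x$" requirement in the question asks for.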
You can write a recursive function to modify the expression tree of the formula:
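replace_term <- function(f, old, new){
  n <- length(f)
  # A call (length > 1): recurse into each element of the expression tree
  if(n > 1) {
    for(i in 1:n) f[[i]] <- Recall(f[[i]], old, new)
    return(f)
  }
  # A leaf: swap it out if it matches the symbol being replaced
  if(f == old) new else f
}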
You can use this to modify, e.g., interactions:
> replace_term(y~x*a+z - x, quote(x), quote(x1 + x2))
y ~ (x1 + x2) * a + z - (x1 + x2)
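A point worth checking with any tree-walking approach is that it matches whole symbols only, unlike a gsub() on the term labels, which also rewrites names that merely contain the pattern. The sketch below uses a slightly different variant of the function, named replace_symbol here purely for illustration, that tests with is.call() and identical() instead of length() and ==.

```r
# Hypothetical variant of the recursive walker above
replace_symbol <- function(f, old, new) {
  if (is.call(f)) {
    # Recurse into the function name and every argument of the call
    for (i in seq_along(f)) f[[i]] <- replace_symbol(f[[i]], old, new)
    return(f)
  }
  # Leaf node: replace only an exact match of the symbol
  if (identical(f, old)) new else f
}

f   <- fruit ~ apple + pineapple + log(apple)
out <- replace_symbol(f, quote(apple), quote(apple_part1 + apple_part2))

# pineapple is left alone; apple is replaced both on its own and inside log()
all.vars(out)

# By contrast, an unanchored gsub() also changes the longer name
gsub("apple", "X", "pineapple")
```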