I would like to write a wrapper around a custom function that takes vectors as input (like mtcars$hp, mtcars$am, etc.), so that it instead takes a data frame (as a data parameter, e.g. mtcars) and variable names (like hp and am), as most standard functions do.
But I have a problem: my proposed 'demo' function (a wrapper around mean) does not work.
Code:
f <- function(x, data=NULL) {
if (!missing(data)) {
with(data, mean(x))
} else {
mean(x)
}
}
Running against a vector works of course:
> f(mtcars$hp)
[1] 146.69
But passing a data frame unfortunately fails:
> f(hp, mtcars)
Error in with(d, mean(x)) : object 'hp' not found
Whereas the same expression works fine in the global environment, without my custom function:
> with(mtcars, mean(hp))
[1] 146.69
I have tried experimenting with substitute, deparse and others, but without any success. Any hint would be welcome!
Here's the key piece of the puzzle:
f <- function(x,data=NULL) {
eval(match.call()$x,data) # this is mtcars$hp, so just take the mean of it or whatever
}
> f(hp,mtcars)
[1] 110 110 93 110 175 105 245 62 95 123 123 180 180 180 205 215 230 66 52 65 97 150 150 245 175 66
[27] 91 113 264 175 335 109
# it even works without a data.frame specified:
> f(seq(10))
[1] 1 2 3 4 5 6 7 8 9 10
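To get back to the original goal, the evaluated column can simply be handed to mean. The following is only a minimal sketch built on the same eval(match.call()$x, data) idea; the name f_mean is made up here for illustration:

```r
# Hypothetical name; wraps mean() around the evaluated argument.
f_mean <- function(x, data = NULL) {
  # Grab x unevaluated, then evaluate it inside data
  # (or in the calling frame when data is NULL).
  values <- eval(match.call()$x, data)
  mean(values)
}

f_mean(hp, mtcars)   # same value as mean(mtcars$hp)
f_mean(mtcars$hp)    # a plain vector still works
```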
See @Andrie's link to @Hadley's document for an explanation of why it works. See @Hadley's note for a critical caveat: f() cannot be run from inside another function.
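The caveat can be reproduced in a few lines. This is a sketch assuming the f() defined above; wrapper, some_var and d are hypothetical names used only for the demonstration. Inside the wrapper, match.call()$x is the symbol some_var rather than hp, and that symbol exists neither in the data frame nor in any environment that eval searches:

```r
f <- function(x, data = NULL) {
  eval(match.call()$x, data)
}

# Hypothetical wrapper that just forwards its arguments to f().
wrapper <- function(some_var, d) {
  f(some_var, d)
}

# Calling f() directly works as before.
direct <- f(hp, mtcars)

# Calling it through the wrapper fails; capture the message instead of stopping.
via_wrapper <- tryCatch(
  wrapper(hp, mtcars),
  error = function(e) conditionMessage(e)
)
via_wrapper
```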
Basically R uses lazy evaluation (i.e. it doesn't evaluate things until they're actually used). So you can get away with passing it hp because it remains an unevaluated symbol until it appears somewhere. Since match.call grabs it as a symbol and waits to evaluate it, all is well.
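To see what match.call actually captures, one can inspect the argument without evaluating it. A small sketch; show_arg is a hypothetical helper, not part of the answer's code:

```r
# Hypothetical helper: reports what was passed as x, without evaluating it.
show_arg <- function(x, data = NULL) {
  arg <- match.call()$x
  list(class = class(arg), text = deparse(arg))
}

sym  <- show_arg(hp, mtcars)   # a bare column name arrives as a symbol ("name")
expr <- show_arg(seq(10))      # an expression arrives as a "call"
```

Neither hp nor seq(10) is evaluated here, which is why the first call does not complain about a missing object.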
Then eval evaluates it in the specified environment. According to ?eval, the second argument represents:
The environment in which expr is to be evaluated. May also be NULL, a list, a data frame, a pairlist or an integer as specified to sys.call.
Therefore you're in good shape with either NULL (if you're not passing a data.frame) or a data.frame.
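The difference between the two cases can be illustrated with a quoted expression. In this sketch the global variable hp is an assumption introduced purely for the demonstration: with a data frame, names are looked up in its columns first; with NULL, the lookup falls through to the calling frame.

```r
expr <- quote(hp * 2)

# A global hp, defined only for this illustration.
hp <- c(1, 2, 3)

from_df   <- eval(expr, mtcars)  # hp is found among the columns of mtcars
from_null <- eval(expr, NULL)    # nothing to search, so the global hp is used
```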
Proof of lazy evaluation is that this doesn't return an error (since x is never used in the function):
> g <- function(x) {
+ 0
+ }
> g(hp)
[1] 0
f <- function(x, data=NULL) {
  if (!missing(data)) {
    colname <- deparse(substitute(x))
    mean(data[[colname]])
  } else {
    mean(x)
  }
}
f(hp, mtcars)
[1] 146.6875
(Admittedly not as compact as @gsk's and I think I will try to remember his method over mine. And thanks to Josh O'Brien for pointing out an error that's now been fixed.)
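One thing worth keeping in mind with the deparse(substitute(x)) approach is that it turns the argument into a column name, so it only suits bare column names and not expressions. A short sketch; arg_name is a hypothetical helper written just to show the intermediate string:

```r
# Hypothetical helper: returns the text that would be used as the column name.
arg_name <- function(x) deparse(substitute(x))

arg_name(hp)       # "hp", a valid column of mtcars
arg_name(hp * 2)   # "hp * 2", which is not a column name

col_ok      <- mtcars[[arg_name(hp)]]       # the hp column
col_missing <- mtcars[[arg_name(hp * 2)]]   # NULL, no such column
```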