
`With` usage inside function (wrapper)

I would like to write a wrapper around a custom function that takes vectors as input (e.g. mtcars$hp, mtcars$am), so that it instead accepts a data frame (as a data parameter, e.g. mtcars) and bare variable names (e.g. hp and am), as most standard functions do.

But I have run into a problem: my proposed 'demo' function (a wrapper around mean) does not work.

Code:

f <- function(x, data=NULL) {
    if (!missing(data)) {
        with(data, mean(x))
    } else {
        mean(x)
    }
}

Running against a vector works of course:

> f(mtcars$hp)
[1] 146.69

But the call using with unfortunately fails:

> f(hp, mtcars)
Error in with(d, mean(x)) : object 'hp' not found

Whereas in the global environment, without my custom function, it works fine:

> with(mtcars, mean(hp))
[1] 146.69

I have tried experimenting with substitute, deparse and others, but without any success. Any hint would be welcome!

daroczig, asked Dec 05 '11 15:12


2 Answers

Here's the key piece of the puzzle:

f <- function(x,data=NULL) {
  eval(match.call()$x,data) # this is mtcars$hp, so just take the mean of it or whatever
}

> f(hp,mtcars)
 [1] 110 110  93 110 175 105 245  62  95 123 123 180 180 180 205 215 230  66  52  65  97 150 150 245 175  66
[27]  91 113 264 175 335 109

# it even works without a data.frame specified:
> f(seq(10))
 [1]  1  2  3  4  5  6  7  8  9 10

See @Andrie's link to @Hadley's document for an explanation of why it works. See @Hadley's note for a critical caveat: f() cannot be run from inside another function.
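
A rough illustration of that caveat, assuming a hypothetical caller named outer_fn (not part of the original answer): inside outer_fn, match.call()$x is the symbol v rather than hp, and v is neither a column of the data frame nor visible from f, so the lookup fails.

```r
f <- function(x, data = NULL) {
  eval(match.call()$x, data)
}

# Hypothetical wrapper: passes its own argument names on to f()
outer_fn <- function(v, d) f(v, d)

# Direct call works: the symbol hp is found as a column of mtcars
direct <- f(hp, mtcars)

# Nested call fails: f() sees the symbol v, which is not a column of d
nested <- tryCatch(outer_fn(hp, mtcars), error = function(e) e)
inherits(nested, "error")
```

This is only a sketch of one failing case; other nesting patterns may behave differently.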

Basically, R uses lazy evaluation (i.e. it doesn't evaluate arguments until they're actually used). So you can get away with passing it hp, because hp remains an unevaluated symbol until something forces it. Since match.call grabs it as a symbol without evaluating it, all is well.

Then eval evaluates it in the specified environment. According to ?eval, the second argument represents:

The environment in which expr is to be evaluated. May also be NULL, a list, a data frame, a pairlist or an integer as specified to sys.call.

Therefore you're in good shape with either NULL (if you're not passing a data.frame) or a data.frame.
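
Putting the pieces together for the original mean example, a minimal sketch might look like the following. The name mean_in is hypothetical, and this variant swaps in substitute(x) for match.call()$x and passes parent.frame() explicitly as the enclosure, so that names not found in data are looked up where the function was called; it is one possible arrangement, not necessarily the one intended above.

```r
# Hypothetical wrapper around mean(); name and layout are assumptions
mean_in <- function(x, data = NULL) {
  expr <- substitute(x)                  # capture the argument unevaluated
  # Look in `data` first (NULL means no columns), then in the caller's frame
  mean(eval(expr, data, parent.frame()))
}

mean_in(hp, mtcars)   # bare column name plus data frame
mean_in(mtcars$hp)    # plain vector, no data frame
```

Both calls should give the same value as with(mtcars, mean(hp)). The caveat about calling from inside another function still applies to this sketch.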

Proof of lazy evaluation is that this doesn't return an error (since x is never used in the function):

> g <- function(x) {
+   0
+ }
> g(hp)
[1] 0
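
As a contrast, here is a small sketch of what happens once the argument is actually used. The names g_lazy, g_forced and no_such_object_here are hypothetical, chosen only for illustration; force() is the base R function that evaluates a promise.

```r
# Never touches x, so the missing object is never looked up
g_lazy <- function(x) {
  0
}

# Forces x, so the lookup happens and fails if the object does not exist
g_forced <- function(x) {
  force(x)
  0
}

lazy_result <- g_lazy(no_such_object_here)
forced_result <- tryCatch(g_forced(no_such_object_here), error = function(e) e)
inherits(forced_result, "error")
```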

Ari B. Friedman, answered Nov 03 '22 17:11


f <- function(x, data=NULL) {
    if (!missing(data)) {
        # capture the bare argument name as a string, then index the data frame
        colname <- deparse(substitute(x))
        mean(data[[colname]])
    } else {
        mean(x)
    }
}

> f(hp, mtcars)
[1] 146.6875

(Admittedly not as compact as @gsk's and I think I will try to remember his method over mine. And thanks to Josh O'Brien for pointing out an error that's now been fixed.)

IRTFM, answered Nov 03 '22 16:11