I'm trying to write a function which takes as argument a dataframe and the name of the function. When I try to write the function with the standard R syntax, I can get the good result using eval
and substitute
as recommanded by @hadley in http://adv-r.had.co.nz/Computing-on-the-language.html
> df <- data.frame(y = 1:10)
> f <- function(data, x) {
+ out <- mean(eval(expr = substitute(x), envir = data))
+ return(out)
+ }
> f(data = df, x = y)
[1] 5.5
Now, when I try to write the same function using the %>%
operator, it doesn't work :
> df <- data.frame(y = 1:10)
> f <- function(data, x) {
+ data %>%
+ eval(expr = substitute(x), envir = .) %>%
+ mean()
+ }
> f(data = df, x = y)
Show Traceback
Rerun with Debug
Error in eval(expr, envir, enclos) : objet 'y' introuvable
>
How can I using the combine the piping operator with the use of eval
and substitute
? It's seems really tricky for me.
Pipes are an extremely useful tool from the magrittr package 1 that allow you to express a sequence of multiple operations. They can greatly simplify your code and make your operations more intuitive. However they are not the only way to write your code and combine multiple operations.
What does the pipe do? The pipe operator, written as %>% , has been a longstanding feature of the magrittr package for R. It takes the output of one function and passes it into another function as an argument. This allows us to link a sequence of analysis steps.
%>% is called the forward pipe operator in R. It provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. It is defined by the package magrittr (CRAN) and is heavily used by dplyr (CRAN).
In RStudio the keyboard shortcut for the pipe operator %>% is Ctrl + Shift + M (Windows) or Cmd + Shift + M (Mac). In RStudio the keyboard shortcut for the assignment operator <- is Alt + - (Windows) or Option + - (Mac).
A workaround would be
f <- function(data, x) {
v <- substitute(x)
data %>%
eval(expr = v, envir = .) %>%
mean()
}
The problem is that the pipe functions (%>%
) are creating another level of closure which interferes with the evaluation of substitute(x)
. You can see the difference with this example
df <- data.frame(y = 1:10)
f1 <- function(data, x) {
print(environment())
eval(expr = environment(), envir = data)
}
f2 <- function(data, x) {
print(environment())
data %>%
eval(expr = environment(), envir = .)
}
f1(data = df, x = y)
# <environment: 0x0000000006388638>
# <environment: 0x0000000006388638>
f2(data = df, x = y)
# <environment: 0x000000000638a4a8>
# <environment: 0x0000000005f91ae0>
Notice how the environments differ in the matrittr version. You want to take care of substitute
stuff as soon as possible when mucking about with non-standard evaluation.
I hope your use case is a bit more complex than your example, because it seems like
mean(df$y)
would be a much easier bit of code to read.
I've been trying to understand my problem.
First, I've written what I want with the summarise()
function :
> library(dplyr)
> df <- data.frame(y = 1:10)
> summarise_(.data = df, mean = ~mean(y))
mean
1 5.5
Then I try to program my own function. I've found a solution which seems to work with the lazyeval
package in this post. I use the lazy()
and the interp()
functions to write what I want.
The first possibility is here :
> library(lazyeval)
> f <- function(data, col) {
+ col <- lazy(col)
+ inter <- interp(~mean(x), x = col)
+ summarise_(.data = data, mean = inter)
+ }
> f(data = df, col = y)
mean
1 5.5
I can also use pipes :
> f <- function(data, col) {
+ col <- lazy(col)
+ inter <- interp(~mean(x), x = col)
+ data %>%
+ summarise_(.data = ., mean = inter)
+ }
>
> f(data = df, col = y)
mean
1 5.5
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With