Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I use dplyr/magrittr's pipe inside functions in R?

I'm trying to write a function which takes as argument a dataframe and the name of the function. When I try to write the function with the standard R syntax, I can get the good result using eval and substitute as recommanded by @hadley in http://adv-r.had.co.nz/Computing-on-the-language.html

> df <- data.frame(y = 1:10)
> f <- function(data, x) {
+   out <- mean(eval(expr = substitute(x), envir = data))
+   return(out)
+ }
> f(data = df, x = y)
[1] 5.5

Now, when I try to write the same function using the %>% operator, it doesn't work :

> df <- data.frame(y = 1:10)
> f <- function(data, x) {
+   data %>% 
+     eval(expr = substitute(x), envir = .) %>% 
+     mean()
+ }
> f(data = df, x = y)
Show Traceback
Rerun with Debug
 Error in eval(expr, envir, enclos) : objet 'y' introuvable 
> 

How can I using the combine the piping operator with the use of eval and substitute ? It's seems really tricky for me.

like image 534
PAC Avatar asked Feb 11 '16 17:02

PAC


People also ask

Can you pipe in a function R?

Pipes are an extremely useful tool from the magrittr package 1 that allow you to express a sequence of multiple operations. They can greatly simplify your code and make your operations more intuitive. However they are not the only way to write your code and combine multiple operations.

How does the %>% pipe work in R?

What does the pipe do? The pipe operator, written as %>% , has been a longstanding feature of the magrittr package for R. It takes the output of one function and passes it into another function as an argument. This allows us to link a sequence of analysis steps.

What does %>% mean in Dplyr?

%>% is called the forward pipe operator in R. It provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. It is defined by the package magrittr (CRAN) and is heavily used by dplyr (CRAN).

How do you type a pipe operator in R?

In RStudio the keyboard shortcut for the pipe operator %>% is Ctrl + Shift + M (Windows) or Cmd + Shift + M (Mac). In RStudio the keyboard shortcut for the assignment operator <- is Alt + - (Windows) or Option + - (Mac).


2 Answers

A workaround would be

f <- function(data, x) {
  v <- substitute(x)
  data %>% 
    eval(expr = v, envir = .) %>%
    mean()
}

The problem is that the pipe functions (%>%) are creating another level of closure which interferes with the evaluation of substitute(x). You can see the difference with this example

df <- data.frame(y = 1:10)
f1 <- function(data, x) {
  print(environment())
  eval(expr = environment(), envir = data)
}

f2 <- function(data, x) {
  print(environment())
  data %>% 
    eval(expr = environment(), envir = .)
}
f1(data = df, x = y)
# <environment: 0x0000000006388638>
# <environment: 0x0000000006388638>
f2(data = df, x = y)
# <environment: 0x000000000638a4a8>
# <environment: 0x0000000005f91ae0>

Notice how the environments differ in the matrittr version. You want to take care of substitute stuff as soon as possible when mucking about with non-standard evaluation.

I hope your use case is a bit more complex than your example, because it seems like

mean(df$y)

would be a much easier bit of code to read.

like image 143
MrFlick Avatar answered Oct 07 '22 08:10

MrFlick


I've been trying to understand my problem.

First, I've written what I want with the summarise() function :

> library(dplyr)
> df <- data.frame(y = 1:10)
> summarise_(.data = df, mean = ~mean(y))
  mean
1  5.5

Then I try to program my own function. I've found a solution which seems to work with the lazyeval package in this post. I use the lazy() and the interp() functions to write what I want.

The first possibility is here :

> library(lazyeval)
> f <- function(data, col) {
+   col <- lazy(col)
+   inter <- interp(~mean(x), x = col)
+   summarise_(.data = data, mean = inter)    
+   }
> f(data = df, col = y)
  mean
1  5.5

I can also use pipes :

> f <- function(data, col) {
+   col <- lazy(col)
+   inter <- interp(~mean(x), x = col)
+   data %>% 
+     summarise_(.data = ., mean = inter)    
+ }
> 
> f(data = df, col = y)
  mean
1  5.5
like image 3
PAC Avatar answered Oct 07 '22 08:10

PAC