
Get expression that evaluated to dot in function called by `magrittr` pipe

I have a function x_expression() which prints the expression passed to argument x.

pacman::p_load(magrittr, rlang)

x_expression <- function(x) {
  print(enquo(x))
}

y <- 1

x_expression(y)
#> <quosure>
#>   expr: ^y
#>   env:  global

y %>% x_expression()
#> <quosure>
#>   expr: ^.
#>   env:  0x7ff27c36a610

So you can see that it knows y was passed to it, but when y is piped in with %>%, the function prints `.` instead. Is there a way to recover the y when it is piped in, or is it gone forever? In brief, what I want is a function like x_expression() that prints y in both cases above.

This question is similar to "Get name of dataframe passed through pipe in R", but slightly more general: that person just wants the name of the data frame, whereas I want the expression, whatever it is. The same answer will likely apply to both, however. I don't like the answer to that near-duplicate question, and neither does its author.

Rory Nolan asked Aug 28 '18 20:08



1 Answer

y is not "gone forever", because the pipe calls your function, and it also knows about y. There's a way to recover y, but it requires some traversal of the calling stack. To understand what's happening, we'll use ?sys.frames and ?sys.calls:

‘sys.calls’ and ‘sys.frames’ give a pairlist of all the active calls and frames, respectively, and ‘sys.parents’ returns an integer vector of indices of the parent frames of each of those frames.
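As a minimal base-R illustration of what sys.calls() reports, consider two nested calls with no pipe involved. The function names inner and outer are made up for this sketch and do not appear in the question:

```r
# Hypothetical helpers, for illustration only.
# inner() returns the stack of active calls at the moment it runs.
inner <- function() sys.calls()
outer <- function() inner()

calls <- as.list(outer())
n <- length(calls)

# The last entry is the innermost call; the one before it is its caller.
print(calls[[n]])      # inner()
print(calls[[n - 1]])  # outer()
```

Entries are ordered from the outermost call to the innermost, which is why a pipe call ends up near the top of the stack and the piped-into function at the bottom.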

If we sprinkle these throughout your x_expression(), we can see what happens when we call y %>% x_expression() from the global environment:

x_expression <- function(x) {
  print( enquo(x) )
  # <quosure>
  #   expr: ^.
  #   env:  0x55c03f142828                <---

  str(sys.frames())
  # Dotted pair list of 9
  #  $ :<environment: 0x55c03f151fa0> 
  #  $ :<environment: 0x55c03f142010> 
  #  ...
  #  $ :<environment: 0x55c03f142828>     <---
  #  $ :<environment: 0x55c03f142940>

  str(sys.calls())
  # Dotted pair list of 9
  #  $ : language y %>% x_expression()    <---
  #  $ : language withVisible(eval(...
  #  ...
  #  $ : language function_list[[k]...
  #  $ : language x_expression(.)
}

I highlighted the important parts with <---. Notice that the quosure captured by enquo() points to the environment of the calling frame (second from the bottom of the stack), while the pipe call that knows about y is all the way at the top of the stack.

There are a couple of ways to traverse the stack. @MrFlick's answer to a similar question, as well as this GitHub issue, traverse the frames / environments from sys.frames(). Here, I will show an alternative that traverses sys.calls() and parses the expressions to find %>%.

The first piece of the puzzle is to define a function that converts an expression to its Abstract Syntax Tree (AST):

# Recursively constructs Abstract Syntax Tree for a given expression
getAST <- function(ee) purrr::map_if(as.list(ee), is.call, getAST)
# Example: getAST( quote(a %>% b) )
# List of 3
#  $ : symbol %>%
#  $ : symbol a
#  $ : symbol b
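To see how the recursion handles a nested call, here is a small base-R sketch of the same idea. The name get_ast is invented for this illustration and uses lapply() in place of purrr::map_if(); it is meant to mirror getAST above, not to replace it:

```r
# Base-R analogue of getAST (hypothetical name): turn a call into a nested
# list, descending into every sub-call.
get_ast <- function(ee) {
  lapply(as.list(ee), function(e) if (is.call(e)) get_ast(e) else e)
}

# quote() only parses, so magrittr does not need to be loaded here.
ast <- get_ast(quote(a %>% f(b)))

# ast[[1]] is the symbol %>%, ast[[2]] is the symbol a, and
# ast[[3]] is itself a list holding the symbols f and b.
str(ast)
```

Because sub-calls become lists, the pipe operator always sits in the first slot of a pipe expression's AST and the left-hand side in the second.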

We can now systematically apply this function to the entire sys.calls() stack. The goal is to identify ASTs where the first element is %>%; the second element will then correspond to the left-hand side of the pipe (symbol a in the a %>% b example). If there is more than one such AST, then we're in a nested %>% pipe scenario. In this case, the last AST in the list will be the lowest in the calling stack and closest to our function.

x_expression2 <- function(x) {
  sc <- sys.calls()
  ASTs <- purrr::map( as.list(sc), getAST ) %>%
    purrr::keep( ~identical(.[[1]], quote(`%>%`)) )  # Match first element to %>%

  if( length(ASTs) == 0 ) return( enexpr(x) )        # Not in a pipe
  dplyr::last( ASTs )[[2]]    # Second element is the left-hand side
}

(Minor note: I used enexpr() instead of enquo() to ensure consistent behavior of the function in and out of the pipe. Since sys.calls() traversal returns an expression, not a quosure, we want to do the same in the default case as well.)
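The difference mentioned in the note can be checked directly. In this small sketch (as_quo and as_expr are names made up for the example), enquo() returns a quosure, which bundles the expression with an environment, while enexpr() returns the bare expression:

```r
library(rlang)

# Hypothetical wrappers, for illustration only.
as_quo  <- function(x) enquo(x)
as_expr <- function(x) enexpr(x)

# y does not need to exist: the argument is captured, never evaluated.
q <- as_quo(y + 1)
e <- as_expr(y + 1)

is_quosure(q)                  # TRUE
is_quosure(e)                  # FALSE: e is a plain call
identical(quo_get_expr(q), e)  # TRUE: same expression underneath
```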

The new function works at top level, inside other functions, and with nested %>% pipes:

x_expression2(y)
# y

y %>% x_expression2()
# y

f <- function() {x_expression2(v)}
f()
# v

g <- function() {u <- 1; u %>% x_expression2()}
g()
# u

y %>% (function(z) {w <- 1; w %>% x_expression2()})  # Note the nested pipes
# w
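For comparison, here is a dependency-free sketch of the same lookup that inspects the calls directly instead of converting them to lists first. The names find_pipe_lhs and x_expression3 are hypothetical, and substitute() stands in for enexpr() in the fallback. One difference worth noting: because the calls are not converted, a compound left-hand side such as sqrt(y) comes back as a call rather than as the nested list that getAST would produce.

```r
# Hypothetical helper: given a list of calls (outermost first), return the
# left-hand side of the innermost call whose function is `%>%`.
find_pipe_lhs <- function(calls, default = NULL) {
  is_pipe <- vapply(
    calls,
    function(cl) is.call(cl) && identical(cl[[1]], as.name("%>%")),
    logical(1)
  )
  if (!any(is_pipe)) return(default)   # no pipe call on the stack
  calls[[max(which(is_pipe))]][[2]]    # LHS of the innermost pipe call
}

# Hypothetical wrapper in the spirit of x_expression2().
x_expression3 <- function(x) {
  calls <- as.list(sys.calls())
  find_pipe_lhs(calls, default = substitute(x))
}

# The helper can be exercised on hand-built stacks, without magrittr:
find_pipe_lhs(list(quote(y %>% f()), quote(f(.))))        # y
find_pipe_lhs(list(quote(sqrt(y) %>% f()), quote(f(.))))  # sqrt(y)
find_pipe_lhs(list(quote(f(y))), default = quote(y))      # y
```

Inside a real pipe, x_expression3() is expected to behave like x_expression2() for simple left-hand sides, although the exact contents of sys.calls() depend on the magrittr version, so treat this as a sketch rather than a tested replacement.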
Artem Sokolov answered Oct 23 '22 02:10