
When should we use curly brackets { } when piping with dplyr [duplicate]

Tags: r, magrittr

When using map on a nested data_frame, I do not understand why the latter two versions give an error. How should I use the dot (.)?

library(tidyverse)
# dummy data
df <- tibble(id = rep(1:10, each = 10),
             val = runif(100))
# tidyr >= 1.0.0 syntax; older versions used nest(df, -id)
df <- nest(df, data = -id)

# works as expected
map(df$data, min)
df %>% .$data %>% map(., min)

# gives an error
df %>% map(.$data, min)
# Error: Don't know how to index with object of type list at level 1

df %>% map(data, min)

asked Feb 22 '17 07:02 by johannes



1 Answer

The problem isn't map, but rather how the %>% pipe deals with the dot (.). Consider the following examples (remember that / is a two-argument function in R):

Simple piping:

1 %>% `/`(2)

Is equivalent to `/`(1, 2) or 1 / 2 and gives 0.5.

It is also equivalent to 1 %>% `/`(., 2).

Simple . use:

1 %>% `/`(2, .)

Is equivalent to `/`(2, 1) or 2 / 1 and gives 2.

You can see that 1 is no longer used as the first argument, but only as the second.

Other . use:

This doesn't work, however, when subsetting the dot:

list(a = 1) %>% `/`(.$a, 2)
Error in `/`(., .$a, 2) : operator needs one or two arguments

We can see that . got injected twice: once as the first argument and once, subsetted, in the second argument. The call therefore has three arguments, which / rejects. An expression like .$a is sometimes referred to as a nested function call (the $ function is used inside the / function, in this case).
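
The injection rule can be made visible with a small helper that only reports how many arguments it received. This is a minimal sketch: count_args is a hypothetical name introduced here for illustration and is not part of magrittr.

```r
library(magrittr)

# Hypothetical helper: returns the number of arguments it was called with.
count_args <- function(...) length(list(...))

# No dot on the right-hand side: lhs is injected as the first argument.
n_plain <- 1 %>% count_args(2)

# Dot used directly as an argument: lhs goes only where the dot is written.
n_direct <- 1 %>% count_args(2, .)

# Dot used only inside a nested call (.$a): lhs is injected as the first
# argument in addition to being used inside .$a.
n_nested <- list(a = 1) %>% count_args(.$a, 2)

print(c(plain = n_plain, direct = n_direct, nested = n_nested))
```

The first two calls receive two arguments each, while the nested case receives three, which matches the error message above.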

We use braces to avoid first argument injection:

list(a = 1) %>% { `/`(.$a, 2) }

Gives 0.5 again.

Actual problem:

You are actually calling map(df, df$data, min), not map(df$data, min).

Solution:

Use braces:

df %>% { map(.$data, min) }
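
As a hedged sketch, here are a few spellings that should give the same result as map(df$data, min). To keep it small and deterministic, it assumes a plain list with a data element (the name nested and its values are made up for illustration) instead of the nested tibble from the question.

```r
library(magrittr)
library(purrr)

# Assumed stand-in for the nested data: a list with a `data` element.
nested <- list(id = 1:2, data = list(c(3, 1, 2), c(9, 7)))

# Reference result without any pipe.
expected <- map(nested$data, min)

# Braces: the dot is used only where it is written.
with_braces <- nested %>% { map(.$data, min) }

# Extract the element first, then pipe it into map().
two_steps <- nested %>% .$data %>% map(min)

# magrittr's exposition pipe makes the names of lhs available to the rhs.
exposed <- nested %$% map(data, min)

stopifnot(
  identical(with_braces, expected),
  identical(two_steps, expected),
  identical(exposed, expected)
)
```

Which form to prefer is mostly a matter of taste; the braces make it explicit that no first-argument injection is wanted.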

Also see the section "Using the dot for secondary purposes" in ?magrittr::`%>%`, which reads:

In particular, if the placeholder is only used in a nested function call, lhs will also be placed as the first argument! The reason for this is that in most use-cases this produces the most readable code. For example, iris %>% subset(1:nrow(.) %% 2 == 0) is equivalent to iris %>% subset(., 1:nrow(.) %% 2 == 0) but slightly more compact. It is possible to overrule this behavior by enclosing the rhs in braces. For example, 1:10 %>% {c(min(.), max(.))} is equivalent to c(min(1:10), max(1:10)).
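
The two examples quoted from the documentation can be checked directly. This is only a small sketch using the built-in iris data; the variable names are chosen here for illustration.

```r
library(magrittr)

# Nested use of the dot: lhs is still injected as the first argument,
# so both forms select the even-numbered rows of iris.
short_form <- iris %>% subset(1:nrow(.) %% 2 == 0)
long_form <- iris %>% subset(., 1:nrow(.) %% 2 == 0)
stopifnot(identical(short_form, long_form))

# Braces suppress the injection; the dot appears only where it is written.
range_1_10 <- 1:10 %>% { c(min(.), max(.)) }
stopifnot(identical(range_1_10, c(min(1:10), max(1:10))))
```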

answered Sep 27 '22 21:09 by Axeman