
What distinguishes dplyr::pull from purrr::pluck and magrittr::extract2?

Tags: r, tidyverse

In the past, when working with a data frame and wanting to get a single column as a vector, I would use magrittr::extract2() like this:

library(dplyr)     # mutate() and %>%
library(magrittr)  # extract2()

mtcars %>%
  mutate(wt_to_hp = wt / hp) %>%
  extract2('wt_to_hp')

But I've seen that dplyr::pull() and purrr::pluck() also exist and do much the same job: return a single vector from a data frame, not unlike [[.

Assuming that I'm always loading all 3 libraries for any project I work on, what are the advantages and use cases of each of these 3 functions? Or more specifically, what distinguishes them from each other?

asked Jan 09 '19 15:01 by crazybilly


1 Answer

When you "should" use a function is really a matter of personal preference: pick whichever function expresses your intention most clearly. There are differences between them, though. For example, pluck works better when you want to do multiple extractions. From the help file:

 accessor(x[[1]])$foo 
 # is the same as
 pluck(x, 1, accessor, "foo")

So while it can be used to just extract a column, it's most useful when you have more deeply nested structures or you want to compose with an accessor function.
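To make the help-file snippet concrete, here is a minimal sketch. The `records` list and the `get_meta` accessor are made up for illustration and are not part of purrr; only `pluck()` and its `.default` argument come from the package.

```r
library(purrr)

# Hypothetical nested list, invented for this example
records <- list(
  list(id = 1, meta = list(tags = c("a", "b"))),
  list(id = 2, meta = list(tags = "c"))
)

# Hypothetical accessor function, standing in for `accessor` in the help file
get_meta <- function(record) record$meta

# Chained base indexing and the equivalent single pluck() call
base_result  <- get_meta(records[[1]])$tags
pluck_result <- pluck(records, 1, get_meta, "tags")

# A path that does not exist yields NULL, or .default if one is supplied
missing_result <- pluck(records, 3, "meta", "tags", .default = NA)

# On a data frame, a single name extracts that column as a vector
mpg_vec <- pluck(mtcars, "mpg")
```

The base and pluck versions should return the same value; the pluck form just reads left to right and avoids an error when part of the path is missing.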

The pull function is meant to blend in with the rest of the dplyr functions. It can take the name of a column in any of the ways that other functions in the package accept one. For example, it will work with !!-style expansion, where extract2 will not.

library(dplyr)

# x is captured unevaluated with enquo() and spliced back in with !!
irispull <- function(x) {
  iris %>% pull(!!enquo(x))
}
irispull(Sepal.Length)
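As a rough illustration of the different ways pull() accepts a column, the sketch below uses mtcars. The `pull_col` wrapper is a hypothetical helper written for this example, and it assumes a dplyr/rlang version recent enough to support the `{{ }}` (embrace) operator, which is a shorthand for the `!!enquo()` pattern above.

```r
library(dplyr)

# A bare name, a string, and a position all select the same column
by_name   <- mtcars %>% pull(mpg)
by_string <- mtcars %>% pull("mpg")
by_pos    <- mtcars %>% pull(1)

# With no argument, pull() takes the last column (carb in mtcars);
# negative positions count from the right
last_default  <- mtcars %>% pull()
last_negative <- mtcars %>% pull(-1)

# Hypothetical wrapper: {{ col }} forwards the caller's bare column name
pull_col <- function(df, col) {
  df %>% pull({{ col }})
}
wrapped <- pull_col(iris, Sepal.Length)
```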

And extract2 is nothing more than a "more readable" wrapper for the base function [[. In fact it's defined as .Primitive("[["), so it expects column names as character strings or column indexes as integers.
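A quick way to check this for yourself is sketched below. It assumes a session in which no object named `disp` exists in the calling environment, so the bare-name call has nothing to fall back on.

```r
library(magrittr)

# extract2 is the same primitive as `[[`
same_function <- identical(extract2, `[[`)

# Character names and integer positions both work
by_name <- mtcars %>% extract2("mpg")
by_pos  <- mtcars %>% extract2(1)

# A bare column name is evaluated as an ordinary R object, so this
# errors unless something called `disp` happens to exist
bare_name_fails <- tryCatch(
  {
    mtcars %>% extract2(disp)
    FALSE
  },
  error = function(e) TRUE
)
```

That last case is the practical difference from pull(), which looks the bare name up among the data frame's columns.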

answered Oct 24 '22 11:10 by MrFlick