In the past, when working with a data frame and wanting to get a single column as a vector, I would use magrittr::extract2() like this:
mtcars %>%
  mutate(wt_to_hp = wt/hp) %>%
  extract2('wt_to_hp')
But I've seen that dplyr::pull() and purrr::pluck() also exist to do much the same job: return a single vector from a data frame, not unlike [[.
Assuming that I'm always loading all 3 libraries for any project I work on, what are the advantages and use cases of each of these 3 functions? Or more specifically, what distinguishes them from each other?
When you "should" use a function is really a matter of personal preference: pick whichever function expresses your intention most clearly. That said, there are differences between them. For example, pluck works better when you want to do multiple extractions. From the help file:
accessor(x[[1]])$foo
# is the same as
pluck(x, 1, accessor, "foo")
So while it can be used to just extract a column, it's most useful when you have more deeply nested structures or you want to compose with an accessor function.
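As a rough illustration of the nested case, here is a minimal sketch. The list x and its field names are made up for this example and are not from the help file.

```r
library(purrr)

# Hypothetical nested list, invented purely for illustration
x <- list(
  list(id = 1, info = list(name = "a", scores = c(10, 20))),
  list(id = 2, info = list(name = "b", scores = c(30, 40)))
)

# Base R: a chain of [[ calls
base_result <- x[[2]][["info"]][["scores"]][[1]]

# pluck: one call, mixing positions and names
pluck_result <- pluck(x, 2, "info", "scores", 1)

# A missing element gives NULL (or .default) instead of an error,
# whereas x[[3]] would fail with "subscript out of bounds"
missing_result <- pluck(x, 3, "info")
default_result <- pluck(x, 3, "info", .default = NA)

# On a data frame, a single name simply extracts that column
mpg_col <- pluck(mtcars, "mpg")
```

For a plain data frame the last line is all you get, so pluck only really pays off once there is more than one level to dig through.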
The pull function is meant to blend in with the rest of the dplyr functions. It can take the name of a column in any of the ways you can with other functions in the package. For example, it will work with !! style expansion, whereas extract2 will not.
irispull <- function(x) {
  iris %>% pull(!!enquo(x))
}
irispull(Sepal.Length)
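A few other ways of naming the column with pull, as a minimal sketch. It assumes a reasonably recent dplyr, where the column argument defaults to -1 (the last column); the variable names on the left are just for illustration.

```r
library(dplyr)

# With no column given, pull() takes the last column,
# which is convenient right after a mutate()
ratio <- mtcars %>%
  mutate(wt_to_hp = wt / hp) %>%
  pull()

# Bare column name, as in other dplyr verbs
by_name <- pull(mtcars, mpg)

# Positive integers count from the left, negative from the right
first_col <- pull(mtcars, 1)
last_col <- pull(mtcars, -1)
```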
And extract2 is nothing more than a "more readable" wrapper for the base function [[. In fact it's defined as .Primitive("[[") so it expects column names as character strings or column indexes as integers.
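A quick way to check this for yourself, as a small sketch (the variable names are only for illustration):

```r
library(magrittr)

# extract2 is the same object as the base [[ primitive
same_function <- identical(extract2, `[[`)

# So it takes exactly what [[ takes: a string or an integer position
by_string <- extract2(mtcars, "mpg")
by_index <- extract2(mtcars, 1)

# Its main appeal is reading nicely at the end of a pipe
piped <- mtcars %>% extract2("mpg")
```

A bare column name such as extract2(mtcars, mpg) is evaluated as an ordinary R object, not looked up as a column, which is why the tidy-eval tricks that work with pull do not apply here.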