I have a simple problem for you purrr-experts out there that has eluded my best googling efforts for some time. First, let's take a look at the nested-list data structure I'm trying to work with.
#R version 3.4.1
library(purrr) # version 0.2.4
library(dplyr) # version 0.7.4
f1 <- function(a, b, c) {a + b^c}
f2 <- function(x) {x * 2}
f3 <- function(y, z) {y * z}
The following parameters are to be passed through to f1, f2, and f3, respectively:
p1 <- data_frame(a = c(2, 4, 5, 7, 8),
                 b = c(1, 1, 2, 2, 2),
                 c = c(.5, 5, 1, 2, 3))
p2 <- data_frame(x = c(1, 4))
p3 <- data_frame(y = c(2, 2, 2, 3),
                 z = c(5, 4, 3, 2))
I am trying to keep my data wieldy, in a nice, neat rectangle. The "id" variable is the function name itself (in my real data, there are hundreds of these):
df <- data_frame(fun_id = c('f1', 'f2', 'f3'),
                 params = list(p1, p2, p3),
                 funs = list(f1, f2, f3))
Checking the structure shows us the list-columns for params and funs:
print(df)
# A tibble: 3 x 3
fun_id params funs
<chr> <list> <list>
1 f1 <tibble [5 x 3]> <fun>
2 f2 <tibble [2 x 1]> <fun>
3 f3 <tibble [4 x 2]> <fun>
Using purrr functions and perhaps dplyr::mutate, how do I get a new list-column in df called results in which each element is a list containing the outputs of executing the functions in funs with parameters taken from params, in a rowwise fashion?
I can get pmap to do what I want for a simple case:
> pmap(.l = p1, .f = f1)
[[1]]
[1] 3
[[2]]
[1] 5
[[3]]
[1] 7
[[4]]
[1] 11
[[5]]
[1] 16
But I really want to do this inside a data frame to keep everything straight. The following gets me to the right structure (a data frame with a list-column for the results), but only for one row and it's not generalized:
> df %>%
    slice(1) %>%
    mutate(results = list(pmap(.l = params[[1]], .f = funs[[1]])))
# A tibble: 1 x 4
fun_id params funs results
<chr> <list> <list> <list>
1 f1 <tibble [5 x 3]> <fun> <list [5]>
Thanks in advance for the help rounding out my problem!
P.S. I have looked at the following resources, but haven't found an answer yet:
purrr::pmap with dplyr::mutate
Using purrr::pmap within mutate to create list-column
http://statwonk.com/purrr.html
https://github.com/rstudio/cheatsheets/raw/master/purrr.pdf
https://jennybc.github.io/purrr-tutorial/index.html
There is a convenience function in purrr for exactly this situation: applying a list of functions to a corresponding list of parameters. It's called invoke_map and can be used with mutate as below. I think the main advantage over map2(~pmap()) is that if any of the functions needs additional parameters that are not included in params, you can add them as named arguments in ... instead of needing to modify params.
library(tidyverse)
f1 <- function(a, b, c) {a + b^c}
f2 <- function(x) {x * 2}
f3 <- function(y, z) {y * z}
p1 <- data_frame(
  a = c(2, 4, 5, 7, 8),
  b = c(1, 1, 2, 2, 2),
  c = c(.5, 5, 1, 2, 3)
)
p2 <- data_frame(x = c(1, 4))
p3 <- data_frame(
  y = c(2, 2, 2, 3),
  z = c(5, 4, 3, 2)
)
df <- data_frame(
  fun_id = c("f1", "f2", "f3"),
  params = list(p1, p2, p3),
  funs = list(f1, f2, f3)
)
df2 <- df %>%
  mutate(results = invoke_map(.f = funs, .x = params))
df2
#> # A tibble: 3 x 4
#> fun_id params funs results
#> <chr> <list> <list> <list>
#> 1 f1 <tibble [5 x 3]> <fn> <dbl [5]>
#> 2 f2 <tibble [2 x 1]> <fn> <dbl [2]>
#> 3 f3 <tibble [4 x 2]> <fn> <dbl [4]>
df2$results
#> [[1]]
#> [1] 3 5 7 11 16
#>
#> [[2]]
#> [1] 2 8
#>
#> [[3]]
#> [1] 10 8 6 6
Created on 2018-07-13 by the reprex package (v0.2.0).
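To illustrate the point about passing extra arguments through ..., here is a minimal sketch. The functions g1 and g2 and the shared argument k are hypothetical names made up for this example; they are not part of the question. Note also that, as far as I know, invoke_map() was deprecated in purrr 1.0.0, so recent versions may emit a deprecation warning when this runs.

```r
library(purrr)

# Hypothetical functions that share an extra argument `k`,
# which is deliberately NOT stored in the parameter lists.
g1 <- function(a, b, k) (a + b) * k
g2 <- function(x, k) x * k

# One parameter list per function, matched by position.
params <- list(
  list(a = c(1, 2), b = c(3, 4)),
  list(x = c(5, 6))
)

# `k = 10` travels through `...` and is appended to every call,
# so `params` does not need to be modified.
res <- invoke_map(.f = list(g1, g2), .x = params, k = 10)
res
```

Every function in the list has to accept the extra argument, since it is supplied to all of them.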
We can use map2 and apply the pmap function for each row.
df2 <- df %>%
  mutate(result = map2(params, funs, ~pmap(.l = .x, .f = .y)))
df2
# # A tibble: 3 x 4
# fun_id params funs result
# <chr> <list> <list> <list>
# 1 f1 <tibble [5 x 3]> <fn> <list [5]>
# 2 f2 <tibble [2 x 1]> <fn> <list [2]>
# 3 f3 <tibble [4 x 2]> <fn> <list [4]>
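Note that map2() with pmap() calls each function once per parameter row and returns a list per function, whereas the functions in the question are vectorised. As a hedged sketch (not either answerer's method), the two variants below return plain double vectors instead: pmap_dbl() keeps the row-by-row calls, and base do.call() makes a single vectorised call per function. The example uses a trimmed two-row version of the question's data and tibble() in place of data_frame(); the column names result_dbl and result_vec are my own.

```r
library(purrr)
library(dplyr)
library(tibble)

f1 <- function(a, b, c) a + b^c
f2 <- function(x) x * 2

# Trimmed version of the question's data: two functions, two rows each.
df <- tibble(
  fun_id = c("f1", "f2"),
  params = list(
    tibble(a = c(2, 4), b = c(1, 1), c = c(0.5, 5)),
    tibble(x = c(1, 4))
  ),
  funs = list(f1, f2)
)

df2 <- df %>%
  mutate(
    # One call per parameter row, collected into a double vector.
    result_dbl = map2(params, funs, ~ pmap_dbl(.l = .x, .f = .y)),
    # One vectorised call per function; the tibble's columns
    # become the named arguments.
    result_vec = map2(funs, params, ~ do.call(.x, .y))
  )

df2$result_vec
```

The do.call() variant only works when the functions are vectorised over their arguments, as f1, f2, and f3 are here; pmap_dbl() makes no such assumption but requires each call to return a single number.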