I have a simple problem for you purrr-experts out there that has eluded my best googling efforts for some time. First, let's take a look at the nested-list data structure I'm trying to work with.
#R version 3.4.1
library(purrr) # version 0.2.4
library(dplyr) # version 0.7.4
f1 <- function(a, b, c) {a + b^c}
f2 <- function(x) {x * 2}
f3 <- function(y, z) {y * z}
The following parameter tables are to be passed to f1, f2, and f3, respectively (one column per argument, one row per call):
p1 <- data_frame(a = c(2, 4, 5, 7, 8),
                 b = c(1, 1, 2, 2, 2),
                 c = c(.5, 5, 1, 2, 3))
p2 <- data_frame(x = c(1, 4))
p3 <- data_frame(y = c(2, 2, 2, 3),
                 z = c(5, 4, 3, 2))
I am trying to keep my data wieldy, in a nice, neat rectangle. The "id" variable is the function name itself (in my real data, there are hundreds of these):
df <- data_frame(fun_id = c('f1', 'f2', 'f3'),
                 params = list(p1, p2, p3),
                 funs = list(f1, f2, f3))
Checking the structure shows us the list-columns for params and funs:
print(df)
# A tibble: 3 x 3
  fun_id params           funs
  <chr>  <list>           <list>
1 f1     <tibble [5 x 3]> <fun>
2 f2     <tibble [2 x 1]> <fun>
3 f3     <tibble [4 x 2]> <fun>
Using purrr functions and perhaps dplyr::mutate, how do I get a new list-column in df called results in which each element is a list containing the outputs of executing the functions in funs with parameters taken from params, in a rowwise fashion?
I can get pmap to do what I want for a simple case:
> pmap(.l = p1, .f = f1)
[[1]]
[1] 3
[[2]]
[1] 5
[[3]]
[1] 7
[[4]]
[1] 11
[[5]]
[1] 16
But I really want to do this inside a data frame to keep everything straight. The following gets me to the right structure (a data frame with a list-column for the results), but only for one row and it's not generalized:
> df %>%
    slice(1) %>%
    mutate(results = list(pmap(.l = params[[1]], .f = funs[[1]])))
# A tibble: 1 x 4
  fun_id params           funs   results
  <chr>  <list>           <list> <list>
1 f1     <tibble [5 x 3]> <fun>  <list [5]>
Thanks in advance for the help rounding out my problem!
P.S. I have looked at the following resources, but haven't found an answer yet:
purrr::pmap with dplyr::mutate
Using purrr::pmap within mutate to create list-column
http://statwonk.com/purrr.html
https://github.com/rstudio/cheatsheets/raw/master/purrr.pdf
https://jennybc.github.io/purrr-tutorial/index.html
There is a convenience function in purrr for exactly this situation: applying a list of functions to a corresponding list of parameters. It's called invoke_map and can be used with mutate as below. Note that it calls each function once, with the whole columns of its parameter table as vector arguments, so each result is a numeric vector rather than a list. I think the main advantage over map2(~pmap()) is that if the functions need additional arguments that are not included in params, you can supply them as named arguments in ... instead of modifying params.
library(tidyverse)
f1 <- function(a, b, c) {a + b^c}
f2 <- function(x) {x * 2}
f3 <- function(y, z) {y * z}
p1 <- data_frame(
a = c(2, 4, 5, 7, 8),
b = c(1, 1, 2, 2, 2),
c = c(.5, 5, 1, 2, 3)
)
p2 <- data_frame(x = c(1, 4))
p3 <- data_frame(
y = c(2, 2, 2, 3),
z = c(5, 4, 3, 2)
)
df <- data_frame(
fun_id = c("f1", "f2", "f3"),
params = list(p1, p2, p3),
funs = list(f1, f2, f3)
)
df2 <- df %>%
mutate(results = invoke_map(.f = funs, .x = params))
df2
#> # A tibble: 3 x 4
#> fun_id params funs results
#> <chr> <list> <list> <list>
#> 1 f1 <tibble [5 x 3]> <fn> <dbl [5]>
#> 2 f2 <tibble [2 x 1]> <fn> <dbl [2]>
#> 3 f3 <tibble [4 x 2]> <fn> <dbl [4]>
df2$results
#> [[1]]
#> [1] 3 5 7 11 16
#>
#> [[2]]
#> [1] 2 8
#>
#> [[3]]
#> [1] 10 8 6 6
Created on 2018-07-13 by the reprex package (v0.2.0).
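To make the point about extra arguments concrete, here is a minimal sketch. It deliberately avoids invoke_map, which was deprecated in purrr 1.0.0, and uses map2 with base R's do.call instead; that is a swapped-in technique, not the code from the reprex above. The functions g1 and g2, the helper call_with_extra, and the scale argument are hypothetical names made up for the illustration.

```r
library(purrr)

# Hypothetical functions (not from the question); both accept `scale`.
g1 <- function(a, b, scale = 1) (a + b) * scale
g2 <- function(x, scale = 1) x * scale

funs <- list(g1, g2)
params <- list(
  data.frame(a = c(1, 2), b = c(3, 4)),
  data.frame(x = c(5, 6, 7))
)

# Hypothetical helper: call each function once with its whole parameter
# table (columns become vector arguments) plus any extra named arguments,
# which are passed to every function.
call_with_extra <- function(funs, params, ...) {
  extra <- list(...)
  map2(funs, params, function(f, p) do.call(f, c(as.list(p), extra)))
}

results <- call_with_extra(funs, params, scale = 10)
```

Because the extra arguments go to every function, each function has to accept them (or have a ... of its own); otherwise the call fails with an unused-argument error.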
We can use map2 to iterate over the params and funs columns in parallel and call pmap on each pair, i.e. once per row of df.
df2 <- df %>%
  mutate(result = map2(params, funs, ~pmap(.l = .x, .f = .y)))
df2
# # A tibble: 3 x 4
#   fun_id params           funs   result
#   <chr>  <list>           <list> <list>
# 1 f1     <tibble [5 x 3]> <fn>   <list [5]>
# 2 f2     <tibble [2 x 1]> <fn>   <list [2]>
# 3 f3     <tibble [4 x 2]> <fn>   <list [4]>
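If you would rather have a plain numeric vector per row instead of a list of length-one results, one option is to swap pmap for its typed variant pmap_dbl. This is only a sketch under the assumption that every function returns a single number per call, which holds for f1, f2, and f3 here. It rebuilds df with tibble() so that it runs on its own; df3 is just an illustrative name.

```r
library(dplyr)
library(purrr)

f1 <- function(a, b, c) a + b^c
f2 <- function(x) x * 2
f3 <- function(y, z) y * z

df <- tibble(
  fun_id = c("f1", "f2", "f3"),
  params = list(
    tibble(a = c(2, 4, 5, 7, 8), b = c(1, 1, 2, 2, 2), c = c(.5, 5, 1, 2, 3)),
    tibble(x = c(1, 4)),
    tibble(y = c(2, 2, 2, 3), z = c(5, 4, 3, 2))
  ),
  funs = list(f1, f2, f3)
)

# pmap_dbl calls the function once per row of the parameter table and
# collects the results into a double vector (it errors if a call does
# not return a single number).
df3 <- df %>%
  mutate(result = map2(params, funs, ~ pmap_dbl(.l = .x, .f = .y)))

df3$result
```

With pmap_dbl the result column holds one double vector per row of df, one value per row of the corresponding parameter table.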