storing an R function output list in a column of lists for further processing

Question

OK, this is a pretty basic question, but I am having a hard time finding an answer. I have functions that return several results as a list, and I want to store that output list in a data frame. The data frame also holds the variables that are passed to the function. For example:

library(dplyr)
### function
testFunc <- function(a){
  a = a
  b = a+1
  c = list(out1=a, out2=b)
  return(c)
}
### data
dat <- data.frame(x=1:5)

### dplyr processing
datProcessed <- dat %>%
  mutate(calcd = testFunc(x))

### Fails: `calcd` must be size 5 or 1, not 2.

However, if the output is a single item, of course it works:

datProcessed <- dat %>%
  mutate(calcd = testFunc(x)$out2)

How do I store a list output in a list-column of the data frame using a dplyr pipe?

r2evans · Accepted Answer

Here are some options, depending wholly on your expected output and what you're going to do with it next.

(BTW: I'm using tibble(dat) instead of dat only so that the printed output distinguishes vector-columns from list-columns; your production code does not need tibble(..).)

If you want both vectors returned from testFunc() as individual columns in dat, then we can just do:
```
tibble(dat) |>
  mutate(as.data.frame(testFunc(x)))
# # A tibble: 5 × 3
#       x  out1  out2
#   <int> <int> <dbl>
# 1     1     1     2
# 2     2     2     3
# 3     3     3     4
# 4     4     4     5
# 5     5     5     6
```
This works because mutate(.) (and other similar verb-functions in dplyr) appends columns when an unnamed argument is itself a data frame. It does not work with a named list, even though the differences between the two are very minor.
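To make the named-versus-unnamed distinction concrete, here is a minimal sketch that reuses testFunc and dat from the question. The names wide, packed, and res are mine, purely for illustration. As far as I know, naming the argument keeps the frame packed as a single data-frame column instead of splicing it into dat.

```r
library(dplyr)

testFunc <- function(a) {
  list(out1 = a, out2 = a + 1)
}

dat <- data.frame(x = 1:5)

# Unnamed data-frame argument: its columns are spliced into dat
wide <- dat |>
  mutate(as.data.frame(testFunc(x)))
names(wide)
# [1] "x"    "out1" "out2"

# Named data-frame argument: the frame stays packed in one column
# (`res` is a hypothetical column name)
packed <- dat |>
  mutate(res = as.data.frame(testFunc(x)))
names(packed)
# [1] "x"   "res"
```

The packed values are still reachable as packed$res$out1 and packed$res$out2.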

If you want each of the pairs of the return values stored in a list-column per-row in dat, then we can use purrr::transpose:

out <- dat |>
  mutate(calcd = purrr::transpose(testFunc(x)))
out
#   x calcd
# 1 1  1, 2
# 2 2  2, 3
# 3 3  3, 4
# 4 4  4, 5
# 5 5  5, 6

tibble(out)
# # A tibble: 5 × 2
#       x calcd           
#   <int> <list>          
# 1     1 <named list [2]>
# 2     2 <named list [2]>
# 3     3 <named list [2]>
# 4     4 <named list [2]>
# 5     5 <named list [2]>

out$calcd[[1]]
# $out1
# [1] 1
# $out2
# [1] 2

In this second form, each element of $calcd is a named list holding one value for each of out1 and out2 (based on how your testFunc(.) works).

Both methods assume that the return from testFunc(.) is a named list of vectors where each vector is the same length as the number of rows.
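Since the goal is further processing, here is a hedged sketch of two ways to get values back out of the list-column. It assumes purrr and tidyr are installed; the names picked and widened are mine and not part of the question.

```r
library(dplyr)

testFunc <- function(a) {
  list(out1 = a, out2 = a + 1)
}

dat <- data.frame(x = 1:5)

out <- dat |>
  mutate(calcd = purrr::transpose(testFunc(x)))

# Pull one named value out of every list element into a regular column
picked <- out |>
  mutate(out2 = purrr::map_dbl(calcd, "out2"))

# Or spread every named value of the list-column into its own column
widened <- tidyr::unnest_wider(out, calcd)
```

The first form is handy when you only need one of the returned values; the second gets you back to the wide layout from the first option above.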

If you aren't familiar with what purrr::transpose does, compare the change:

str(testFunc(dat$x))
# List of 2
#  $ out1: int [1:5] 1 2 3 4 5
#  $ out2: num [1:5] 2 3 4 5 6

str(purrr::transpose(testFunc(dat$x)))
# List of 5
#  $ :List of 2
#   ..$ out1: int 1
#   ..$ out2: num 2
#  $ :List of 2
#   ..$ out1: int 2
#   ..$ out2: num 3
#  $ :List of 2
#   ..$ out1: int 3
#   ..$ out2: num 4
#  $ :List of 2
#   ..$ out1: int 4
#   ..$ out2: num 5
#  $ :List of 2
#   ..$ out1: int 5
#   ..$ out2: num 6
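If you would rather not depend on purrr for this step, the same reshaping can be sketched in base R with Map(). This is only an illustration of what the transposition does, not what purrr runs internally; res and rowwise_lists are names I made up.

```r
testFunc <- function(a) {
  list(out1 = a, out2 = a + 1)
}

dat <- data.frame(x = 1:5)

res <- testFunc(dat$x)

# Map() walks the vectors in parallel and calls list() on each set of
# elements; passing them through do.call() keeps the names out1 and out2
rowwise_lists <- do.call(Map, c(f = list, res))

str(rowwise_lists[[1]])
# List of 2
#  $ out1: int 1
#  $ out2: num 2
```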

deschen · Answer

You probably want to apply your function to each row individually, in which case you could do:

library(tidyverse)
dat %>%
  mutate(calcd = apply(across(x), 1, testFunc))

This returns:

  x calcd
1 1  1, 2
2 2  2, 3
3 3  3, 4
4 4  4, 5
5 5  5, 6

Calling str() on the result shows the structure:

'data.frame':   5 obs. of  2 variables:
 $ x    : int  1 2 3 4 5
 $ calcd:List of 5
  ..$ :List of 2
  .. ..$ out1: Named int 1
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 2
  .. .. ..- attr(*, "names")= chr "x"
  ..$ :List of 2
  .. ..$ out1: Named int 2
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 3
  .. .. ..- attr(*, "names")= chr "x"
  ..$ :List of 2
  .. ..$ out1: Named int 3
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 4
  .. .. ..- attr(*, "names")= chr "x"
  ..$ :List of 2
  .. ..$ out1: Named int 4
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 5
  .. .. ..- attr(*, "names")= chr "x"
  ..$ :List of 2
  .. ..$ out1: Named int 5
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 6
  .. .. ..- attr(*, "names")= chr "x"
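The "names" attributes in that output come from apply(), which converts the frame to a matrix and hands each row to the function as a named vector. When the function takes a single column, a variant with lapply() avoids them. This is a minimal sketch under that single-column assumption; res is a name I chose.

```r
library(dplyr)

testFunc <- function(a) {
  list(out1 = a, out2 = a + 1)
}

dat <- data.frame(x = 1:5)

# lapply() passes each value of x to testFunc() on its own, so the
# results carry no leftover "x" names
res <- dat %>%
  mutate(calcd = lapply(x, testFunc))

str(res$calcd[[1]])
# List of 2
#  $ out1: int 1
#  $ out2: num 2
```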
