I have a list of sub-lists that I wish to convert to a data frame (specifically as a tibble); for example:
myList <- list(
list(var1=1,var2=2,var3=3,var4=4,var5=5,var6=6),
list(var1=4,var2=5,var3=6,var4=7,var5=8,var6=9),
list(var1=7,var2=8,var3=9,var4=1,var5=2,var6=3)
)
Using the following code, I can extract chosen variables to a tibble data frame:
myDF <- tbl_df(cbind(
var1 = lapply(myList, '[[', "var1"),
var2 = lapply(myList, '[[', "var2"),
var5 = lapply(myList, '[[', "var5"),
var6 = lapply(myList, '[[', "var6")
))
But it is quite verbose. Is there a more succinct way (perhaps using a purrr map function) that can pull chosen sub-elements out of each list and populate them into a row?
Further, if the sub-lists contain lists themselves, how best to extract elements of those lists? For example:
myList <- list(
list(var1=1,var2=2,var3=3,list4=list(varA="a",varB="b")),
list(var1=4,var2=5,var3=6,list4=list(varA="c",varB="d")),
list(var1=7,var2=8,var3=9,list4=list(varA="e",varB="f"))
)
How could I get something like the following to work:
myDF <- tbl_df(cbind(
var1 = lapply(myList, '[[', "var1"),
var2 = lapply(myList, '[[', "var2"),
var4 = lapply(myList, '[[', "list4$varA")
))
Here I want to extract a specific element from list4, but using $ notation inside the string to drill down to the next level does not work.
Since data frames are just lists, the following works if your list isn't nested more than once:
library(tidyverse)
myList %>%
map(as.data.frame) %>%
bind_rows() %>%
select(var1, var2, var5, var6)
# var1 var2 var5 var6
# 1 1 2 5 6
# 2 4 5 8 9
# 3 7 8 2 3
Or even the following, since bind_rows() actually works on a list of lists:
myList %>%
bind_rows() %>%
select(var1, var2, var5, var6)
# var1 var2 var5 var6
# <dbl> <dbl> <dbl> <dbl>
# 1 1.00 2.00 5.00 6.00
# 2 4.00 5.00 8.00 9.00
# 3 7.00 8.00 2.00 3.00
However, sometimes the list elements share only some common elements, and you want to select just those from each one before binding:
myList %>%
map(as.data.frame) %>%
map(~ select(.x, var1, var2, var5, var6)) %>%
bind_rows()
# var1 var2 var5 var6
# 1 1 2 5 6
# 2 4 5 8 9
# 3 7 8 2 3
For cases where the lists are nested more than once, investigate using flatten() from purrr:
myList2 <- list(
list(var1=1,var2=2,var3=3,list4=list(varA="a",varB="b")),
list(var1=4,var2=5,var3=6,list4=list(varA="c",varB="d")),
list(var1=7,var2=8,var3=9,list4=list(varA="e",varB="f"))
)
myList2 %>%
map(flatten) %>%
bind_rows()
# var1 var2 var3 varA varB
# <dbl> <dbl> <dbl> <chr> <chr>
# 1 1.00 2.00 3.00 a b
# 2 4.00 5.00 6.00 c d
# 3 7.00 8.00 9.00 e f
Then apply select() as desired; the column names will be the names of the respective elements. Be very careful with duplicate names in different nested elements, because only one of them will be kept.
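If only one nested element is needed, rather than everything that flatten() pulls up, a minimal sketch along the lines of the question's original attempt is to build the columns one at a time with purrr's typed map functions and pluck(). The output column name var4 follows the question; treat this as one possible approach, not the only one.

```r
library(purrr)
library(tibble)

myList2 <- list(
  list(var1 = 1, var2 = 2, var3 = 3, list4 = list(varA = "a", varB = "b")),
  list(var1 = 4, var2 = 5, var3 = 6, list4 = list(varA = "c", varB = "d")),
  list(var1 = 7, var2 = 8, var3 = 9, list4 = list(varA = "e", varB = "f"))
)

myDF <- tibble(
  # A string passed to map_*() is used as an extractor: x[["var1"]]
  var1 = map_dbl(myList2, "var1"),
  var2 = map_dbl(myList2, "var2"),
  # pluck() drills down one level per argument: x[["list4"]][["varA"]]
  var4 = map_chr(myList2, ~ pluck(.x, "list4", "varA"))
)
```

Because map_dbl() and map_chr() return atomic vectors, the resulting columns are ordinary numeric and character columns rather than list-columns.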
There may be situations where the enframe() function from tibble is also useful.
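As a rough illustration of where enframe() might fit, the sketch below turns the list into a two-column tibble and then widens it with unnest_wider() from tidyr. Note that tidyr is an addition that the answer above does not mention, and the column names row and item are arbitrary choices made for this example.

```r
library(tibble)
library(tidyr)

myList2 <- list(
  list(var1 = 1, var2 = 2, var3 = 3, list4 = list(varA = "a", varB = "b")),
  list(var1 = 4, var2 = 5, var3 = 6, list4 = list(varA = "c", varB = "d")),
  list(var1 = 7, var2 = 8, var3 = 9, list4 = list(varA = "e", varB = "f"))
)

# One row per sub-list; the sub-list itself sits in the list-column `item`
wide <- enframe(myList2, name = "row", value = "item")

# Spread the elements of each sub-list into columns (list4 stays a list-column)
wide <- unnest_wider(wide, item)

# Spread the nested list4 into its own columns, varA and varB
wide <- unnest_wider(wide, list4)

result <- wide[c("var1", "var2", "varA")]
```

Each unnest_wider() call removes one level of nesting, so deeper structures would need one call per level.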
For the first case, a possible base-R solution:
> data.frame(do.call(rbind, myList))[c("var1", "var2", "var5", "var6")]
var1 var2 var5 var6
1 1 2 5 6
2 4 5 8 9
3 7 8 2 3
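For the second (nested) case, a comparable base-R sketch is to chain [[ inside an anonymous function, which is what the "list4$varA" string in the question was trying to express. The column name var4 follows the question.

```r
myList2 <- list(
  list(var1 = 1, var2 = 2, var3 = 3, list4 = list(varA = "a", varB = "b")),
  list(var1 = 4, var2 = 5, var3 = 6, list4 = list(varA = "c", varB = "d")),
  list(var1 = 7, var2 = 8, var3 = 9, list4 = list(varA = "e", varB = "f"))
)

myDF2 <- data.frame(
  var1 = sapply(myList2, function(x) x[["var1"]]),
  var2 = sapply(myList2, function(x) x[["var2"]]),
  # Drill down one level at a time: first list4, then varA
  var4 = sapply(myList2, function(x) x[["list4"]][["varA"]]),
  stringsAsFactors = FALSE
)
```

Here sapply() simplifies each result to an atomic vector, so the columns are plain numeric and character vectors.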