How can I unlist a nested list column in a data.table? Assume all the list elements are the same type. The list elements are named, so the names have to be handled as well.
It is roughly the opposite of data.table aggregation to a list column.
I think it is worth having in the SO data.table knowledge base.
My current workaround is below; I'm looking for a more canonical answer.
library(data.table)
dt <- data.table(
  a = letters[1:3],
  l = list(list(c1=6L, c2=4L), list(x=2L, y=4L, z=3L), list())
)
dt[]
#    a      l
# 1: a <list>
# 2: b <list>
# 3: c <list>
dt[, .(a = rep(a, length(l)),
       nm = names(unlist(l)),
       ul = unlist(l)),
   .(id = seq_along(a))
   ][, id := NULL
     ][]
#    a nm ul
# 1: a c1  6
# 2: a c2  4
# 3: b  x  2
# 4: b  y  4
# 5: b  z  3
# 6: c NA NA
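For comparison, the reverse operation mentioned in the question (aggregating to a list column) might look like the minimal sketch below. The input table `flat` and the result name `agg` are made up for illustration; they are not from the original post.

```r
library(data.table)

# Hypothetical long-format input: one row per named value
flat <- data.table(
  a  = c("a", "a", "b", "b", "b"),
  nm = c("c1", "c2", "x", "y", "z"),
  ul = c(6L, 4L, 2L, 4L, 3L)
)

# Build a named list per group; the outer list() wraps it so that
# each group yields a single list-column cell
agg <- flat[, .(l = list(setNames(as.list(ul), nm))), by = a]
```

Here `agg` has one row per value of `a`, and each cell of `l` holds a named list like the ones in `dt` above.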
Use the unlist() function to convert a list to a vector by flattening its elements. A list in R can hold elements of different types, whereas an atomic vector is a basic data structure whose elements all share the same data type.
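A small base R sketch of that behaviour, using made-up variable names (`x`, `v`, `m`, `e`). It also shows why an empty list needs special handling in this problem: it flattens to NULL, so it contributes neither values nor names.

```r
# A named list whose elements are all integers
x <- list(c1 = 6L, c2 = 4L)
v <- unlist(x)   # named integer vector; the names are carried over
names(v)         # "c1" "c2"

# Mixed types are coerced to a common type (character here)
m <- unlist(list(1L, "a"))

# An empty list flattens to NULL
e <- unlist(list())
```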
Or, more commonly, we can create nested data frames using tidyr::nest(). df %>% nest(x, y) specifies the columns to be nested, i.e. the columns that will appear in the inner data frame. Alternatively, you can nest() a grouped data frame created by dplyr::group_by().
Nesting creates a list-column of data frames; unnesting flattens it back out into regular columns. Nesting is implicitly a summarising operation: you get one row for each group defined by the non-nested columns.
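A minimal sketch of the nest/unnest round trip, assuming a reasonably recent tidyr (1.0.0 or later, where the nested columns are passed in the named form `data = c(x, y)`). The data frame `df` and the column name `data` are invented for illustration.

```r
library(tidyr)

# Hypothetical input: g is the grouping column, x and y get nested
df <- data.frame(g = c("a", "a", "b"), x = 1:3, y = 4:6)

# One row per value of g; the list-column "data" holds the inner data frames
nested <- nest(df, data = c(x, y))

# Flatten the list-column back out into regular columns
flat <- unnest(nested, cols = data)
```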
Not sure it is more "canonical", but here is a way to modify l so you can use by=a, considering you know the type of the data in your list (with some improvements, thanks to @DavidArenburg):
dt[lengths(l) == 0, l := NA_integer_][, .(nm = names(unlist(l)), ul = unlist(l)), by = a]
#    a nm ul
# 1: a c1  6
# 2: a c2  4
# 3: b  x  2
# 4: b  y  4
# 5: b  z  3
# 6: c NA NA
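Note that `:=` updates dt by reference, so the empty list in row 3 is replaced in dt itself. If that side effect is unwanted, one possible variation is to build the result from a copy of the list column without grouping. This is only a sketch; `filled`, `nm` and `res` are names made up here, and it assumes integer values as in the question.

```r
library(data.table)

dt <- data.table(
  a = letters[1:3],
  l = list(list(c1 = 6L, c2 = 4L), list(x = 2L, y = 4L, z = 3L), list())
)

# Replace empty elements with a single NA in a new list, leaving dt untouched
filled <- lapply(dt$l, function(x) if (length(x) == 0L) list(NA_integer_) else x)

# Collect names per element; unnamed placeholders get NA rather than ""
nm <- unlist(lapply(filled, function(x) {
  if (is.null(names(x))) rep(NA_character_, length(x)) else names(x)
}))

# Repeat each a once per value and flatten the values
res <- data.table(
  a  = rep(dt$a, lengths(filled)),
  nm = nm,
  ul = unlist(filled, use.names = FALSE)
)
```

The rows of `res` should match the output shown above, while `dt$l[[3]]` is still an empty list.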