Unlist nested list columns in data.table

Tags: r, data.table

How can I unlist a nested list column in a data.table? Assume all the list elements are of the same type. The list elements are named, and the names have to be preserved as well.
This is roughly the opposite of aggregating into a list column in data.table.
I think it is worth having in the SO data.table knowledge base.
My current workaround is below; I'm looking for a more canonical answer.

library(data.table)
dt <- data.table(
    a = letters[1:3], 
    l = list(list(c1=6L, c2=4L), list(x=2L, y=4L, z=3L), list())
)
dt[]
#    a      l
# 1: a <list>
# 2: b <list>
# 3: c <list>
dt[,.(a = rep(a,length(l)),
      nm = names(unlist(l)),
      ul = unlist(l)),
   .(id = seq_along(a))
   ][, id := NULL
     ][]
#    a nm ul
# 1: a c1  6
# 2: a c2  4
# 3: b  x  2
# 4: b  y  4
# 5: b  z  3
# 6: c NA NA
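For context, the aggregation that the question calls the opposite operation can be sketched as follows. This is an illustration only, assuming a long table with the same column names as the output above (`a`, `nm`, `ul`); the names `long` and `nested` are hypothetical and not part of the original post.

```r
library(data.table)

# Hypothetical long-format input, mirroring the non-NA rows of the output above.
long <- data.table(
    a  = c("a", "a", "b", "b", "b"),
    nm = c("c1", "c2", "x", "y", "z"),
    ul = c(6L, 4L, 2L, 4L, 3L)
)

# Aggregate to a list column: one named list per value of `a`.
# The outer list() wraps the per-group list so it becomes a single cell.
nested <- long[, .(l = list(setNames(as.list(ul), nm))), by = a]
```

Here `nested` has one row per group, and each cell of `l` holds a named list, which is the shape the question starts from (without the empty-list row for `c`).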
asked Jul 15 '15 12:07 by jangorecki



1 Answer

I'm not sure it is more "canonical", but here is a way to modify l so that you can use by = a, provided you know the type of the data in the list (with some improvements thanks to @DavidArenburg):

dt[lengths(l) == 0, l := NA_integer_][, .(nm = names(unlist(l)), ul = unlist(l)), by = a]

#   a nm ul
#1: a c1  6
#2: a c2  4
#3: b  x  2
#4: b  y  4
#5: b  z  3
#6: c NA NA
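Note that `:=` in the one-liner above changes `l` in `dt` by reference. If the original column should stay untouched, a variant that handles the empty list inside `j` might look like the sketch below. It is not part of the original answer, and it assumes, as in the question, that the values are integers and that grouping by `a` is appropriate; `res` is a hypothetical name.

```r
library(data.table)

dt <- data.table(
    a = letters[1:3],
    l = list(list(c1 = 6L, c2 = 4L), list(x = 2L, y = 4L, z = 3L), list())
)

# Unlist each group's nested list; emit a single NA row when it is empty.
# No := is used, so dt itself is left unchanged.
res <- dt[, {
    v <- unlist(l)
    if (length(v) == 0L) {
        list(nm = NA_character_, ul = NA_integer_)  # assumes integer values
    } else {
        list(nm = names(v), ul = v)
    }
}, by = a]
```

The explicit `NA_character_` and `NA_integer_` keep the column types consistent across groups, which grouped `j` results in data.table require.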
answered Oct 13 '22 06:10 by Cath