
Turning lists of lists of lists into a data frame

I have a set of lists whose names are stored in all_list.

all_list=c("LIST1","LIST2")

From these, I would like to create a data frame such that

LISTn$findings${Col_i}$character is entered into the i'th column of the n'th row, with the row name taken from LISTn$rowname.

DATA

LIST1=list()
LIST1[["findings"]]=list(s1a=list(character="a1",number=1,string="a1type",exp="great"),
                        s1b=list(number=2,string="b1type"),
                        in2a=list(character="c1",number=3,string="c1type"),
                        del3b=list(character="d1",number=4,string="d1type"))
LIST1[["rowname"]]="Row1"

LIST2=list()
LIST2[["findings"]]=list(s1a=list(character="a2",number=5,string="a2type",exp="great"),
                        s1b=list(character="b2",number=6,string="b2type"),
                        in2a=list(character="c2",number=7,string="c2type"),
                        del3b=list(character="d2",number=8,string="d2type"))
LIST2[["rowname"]]="Row2"

Please note that some findings have no character entry; NA would suffice for those.

Desired output is this data frame:

       s1a  s1b in2a del3b 
Row1    a1   NA  c1   d1
Row2    a2   b2  c2   d2

There are about 1000 of these lists, so speed is a factor. Each list is about 50 MB after I load it through rjson::fromJSON(file=x).

The row and column names don't follow a particular pattern. They're names and attributes.

Shahin asked Jan 25 '23 11:01


2 Answers

We can use a combination of lapply and sapply to loop over the nested lists, extract the "character" element of each finding, and substitute NA where it is missing

# Rows are labelled with the object names (LIST1, LIST2), columns with the finding names
do.call(rbind, lapply(mget(all_list), function(x) 
  sapply(lapply(x$findings, `[[`, "character"),
      function(y) if (is.null(y)) NA_character_ else y)))

Or it can also be done in a single pass over the findings, returning NA directly when "character" is absent

do.call(rbind, lapply(mget(all_list), function(x)  {
 # vapply guarantees exactly one string per finding and keeps the finding names
 vapply(x$findings, function(y) 
       if (is.null(y$character)) NA_character_ else y$character[1],
       character(1))}))
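
To get the row names from each list's rowname element, as in the desired output, and to stay safe if the lists do not all share the same finding names, something along these lines might work. This is only a sketch in base R: extract_characters is a hypothetical helper name, and the data are the two sample lists from the question.

```r
# Sample data from the question (s1b in LIST1 has no "character" entry).
LIST1 <- list(
  findings = list(
    s1a   = list(character = "a1", number = 1, string = "a1type", exp = "great"),
    s1b   = list(number = 2, string = "b1type"),
    in2a  = list(character = "c1", number = 3, string = "c1type"),
    del3b = list(character = "d1", number = 4, string = "d1type")),
  rowname = "Row1")
LIST2 <- list(
  findings = list(
    s1a   = list(character = "a2", number = 5, string = "a2type", exp = "great"),
    s1b   = list(character = "b2", number = 6, string = "b2type"),
    in2a  = list(character = "c2", number = 7, string = "c2type"),
    del3b = list(character = "d2", number = 8, string = "d2type")),
  rowname = "Row2")
all_list <- c("LIST1", "LIST2")

# Hypothetical helper: one named character vector per list, NA where
# a finding has no "character" entry. [[ ]] avoids partial name matching.
extract_characters <- function(x) {
  vapply(x$findings, function(f) {
    v <- f[["character"]]
    if (is.null(v)) NA_character_ else as.character(v[1])
  }, character(1))
}

lists <- mget(all_list)
rows  <- lapply(lists, extract_characters)

# Union of finding names, in order of first appearance.
cols <- unique(unlist(lapply(rows, names)))

# Indexing by name yields NA for findings a list does not have.
mat <- do.call(rbind, lapply(rows, function(r) unname(r[cols])))
dimnames(mat) <- list(
  vapply(lists, function(x) x[["rowname"]], character(1), USE.NAMES = FALSE),
  cols)

result <- as.data.frame(mat, stringsAsFactors = FALSE)
result
```

Aligning on the union of names costs one extra pass over the extracted vectors, which are tiny compared with the parsed lists themselves.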
akrun answered Feb 04 '23 02:02


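purrr provides map_chr, which is built for this kind of task: it extracts a named element from every item of a list and accepts a default for items where that element is missing.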

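library(purrr)
# One column per list, one row per finding; missing "character" entries become NA
sapply(mget(all_list), function(x) purrr::map_chr(x$findings, "character", .default = NA_character_)) %>%
  t() %>%           # transpose so that each list becomes a row
  data.frame()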
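
With about 1000 files of roughly 50 MB each, it may also help to pull out the character fields right after reading each file, so that the large parsed list can be garbage-collected before the next file is read. The following is a rough sketch under that assumption: extract_one, collect_characters, and the files vector are hypothetical names, and read_fun defaults to the rjson::fromJSON(file=x) call from the question.

```r
library(purrr)

# Hypothetical helper: read one file, keep only the rowname and the
# "character" field of each finding, and discard everything else.
extract_one <- function(path, read_fun = function(p) rjson::fromJSON(file = p)) {
  x <- read_fun(path)
  list(rowname = x[["rowname"]],
       values  = map_chr(x$findings, "character", .default = NA_character_))
}

# Hypothetical helper: process the files one by one and bind the small
# per-file vectors into a data frame, aligning columns by finding name.
collect_characters <- function(files, ...) {
  parts <- lapply(files, extract_one, ...)
  cols  <- unique(unlist(lapply(parts, function(p) names(p$values))))
  mat   <- do.call(rbind, lapply(parts, function(p) unname(p$values[cols])))
  dimnames(mat) <- list(vapply(parts, function(p) p$rowname, character(1)), cols)
  as.data.frame(mat, stringsAsFactors = FALSE)
}

# Usage (files is an assumed character vector of JSON paths):
# result <- collect_characters(files)
```

Whether this is faster overall depends on how much of the time is spent parsing JSON, which this sketch does not change; it mainly keeps peak memory down.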
shahin shahsavari answered Feb 04 '23 02:02