
Extracting Information from Multi-Level Nested Lists

Tags: r, purrr, tidyverse

I have data in a nested list: each member of the top-level list has descriptors (not necessarily the same ones) and, optionally, a list of children; those children can have children of their own, and so on. The depth of the nesting is arbitrary. A minimal example:

family <- list(
  list(name = "Alice", age = 40, eyes = "blue", children = list(
    list(name = "Bob", age = 20, eyes = "blue"),
    list(name = "Charlie", age = 18, eyes = "brown")
  )),
  list(name = "Dan", age = 12, eyes = "green"),
  list(name = "Erin", age = 69, eyes = "green", children = list(
    list(name = "Frank", age = 45, eyes = "blue", children = list(
      list(name = "George", age = 24, eyes = "blue", children = list(
        list(name = "Harry", age = 2, eyes = "green")
      )),
      list(name = "Ingrid", age = 22, eyes = "brown", hair = "brown"),
      list(name = "Jack", age = 29, eyes = "brown")
    )),
    list(name = "Karen", age = 43),
    list(name = "Larry", age = 21, eyes = "blue")
  ))
)

> str(family, max.level = 2)
List of 3
 $ :List of 4
  ..$ name    : chr "Alice"
  ..$ age     : num 40
  ..$ eyes    : chr "blue"
  ..$ children:List of 2
 $ :List of 3
  ..$ name: chr "Dan"
  ..$ age : num 12
  ..$ eyes: chr "green"
 $ :List of 4
  ..$ name    : chr "Erin"
  ..$ age     : num 69
  ..$ eyes    : chr "green"
  ..$ children:List of 3

Ideally, I would like to create a data frame in which each row is a family member and each column is an attribute from the list:

      name       age      eyes
    1 Alice       40      blue
    2 Bob         20      blue
    3 Charlie     18      brown
(etc)

But it is unclear to me how to do this recursively to an arbitrary depth. I have managed to get the top-level members using map(family, ~ .$name), but I don't understand how to go deeper. It is complicated by the fact that some members don't have any children.
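To make the sticking point concrete, here is a small sketch in base R (not part of the original question). It uses a cut-down stand-in list, `fam`, and the illustrative names `top_names` and `child_names`, all of which are assumptions made for this example. It shows the top-level extraction and one manual step deeper, and suggests that childless members may not need special handling, because iterating over a missing `children` element yields an empty result.

```r
# Small stand-in for the `family` list (same shape, fewer people).
fam <- list(
  list(name = "Alice", age = 40, children = list(
    list(name = "Bob", age = 20)
  )),
  list(name = "Dan", age = 12)
)

# Top level only: the base R counterpart of purrr::map(fam, ~ .$name),
# simplified to a character vector.
top_names <- vapply(fam, function(p) p$name, character(1))

# One level down: `p$children` is NULL for Dan, and iterating over NULL
# gives a zero-length result instead of an error.
child_names <- lapply(fam, function(p) {
  vapply(p$children, function(k) k$name, character(1))
})
```

Going deeper by hand would mean nesting one more call per level, which is why an arbitrary depth calls for a recursive function.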

I have gone over the purrr documentation, but I can't find anything that seems like it would help. Maybe I am not using the right terminology.

Advice is appreciated. Thanks!

CDoug asked Jan 14 '18 15:01


2 Answers

You can always recursively walk the list and collect the pieces of information. In addition to map_dfr from purrr, my solution uses defaults from plyr to fill in attributes that a person lacks, such as a missing eye or hair color.

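get_people <- function(x) {
  if (is.null(x)) return(NULL)  # no children: nothing to add
  if (!is.null(x$name)) {
    # A name means x is a single person: split off the children,
    # turn the remaining attributes into a one-row data frame,
    # then append the rows for all descendants.
    children <- x$children
    x$children <- NULL
    # plyr::defaults() fills in any attribute the person lacks with NA
    row <- data.frame(plyr::defaults(x, list(age = NA, eyes = NA, hair = NA)),
                      stringsAsFactors = FALSE)
    rbind(row, get_people(children))
  } else {
    # No name means x is a list of people: recurse and row-bind the results
    purrr::map_dfr(x, get_people)
  }
}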

When you apply this to your list:

> get_people(family)
      name age  eyes  hair
1    Alice  40  blue  <NA>
2      Bob  20  blue  <NA>
3  Charlie  18 brown  <NA>
4      Dan  12 green  <NA>
5     Erin  69 green  <NA>
6    Frank  45  blue  <NA>
7   George  24  blue  <NA>
8    Harry   2 green  <NA>
9   Ingrid  22 brown brown
10    Jack  29 brown  <NA>
11   Karen  43  <NA>  <NA>
12   Larry  21  blue  <NA>
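If plyr and purrr are not available, the same walk can be sketched in base R. This is not part of the original answer: `get_people_base`, its `defaults` argument and the small `fam` list are hypothetical names chosen for illustration. It assumes `modifyList()` is an acceptable stand-in for `plyr::defaults()` and `do.call(rbind, ...)` for `map_dfr()`.

```r
# Hypothetical base-R take on the recursive walk described above.
get_people_base <- function(x, defaults = list(age = NA, eyes = NA, hair = NA)) {
  if (is.null(x)) return(NULL)
  if (!is.null(x$name)) {
    # x is a single person
    children <- x$children
    x$children <- NULL
    # modifyList(defaults, x) keeps a default only where x has no value
    person <- modifyList(defaults, x)
    # put the columns in a fixed order: name first, then the defaults
    person <- person[c("name", names(defaults))]
    row <- as.data.frame(person, stringsAsFactors = FALSE)
    rbind(row, get_people_base(children, defaults))
  } else {
    # x is a list of people: recurse and stack the resulting data frames
    do.call(rbind, lapply(x, get_people_base, defaults = defaults))
  }
}

# Small stand-in data with some attributes missing.
fam <- list(
  list(name = "Alice", age = 40, eyes = "blue", children = list(
    list(name = "Bob", age = 20, hair = "brown")
  )),
  list(name = "Dan", age = 12)
)

people <- get_people_base(fam)
```

As with the version above, the set of columns has to be known in advance through the defaults list; attributes not listed there would need to be added to `defaults` before calling the function.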
Claus Wilke answered Nov 18 '22 21:11


Here's an option that iterates over the children elements, collecting people into a flattened list as it goes:

flatten_family <- function(x){
    ppl <- list()  # accumulator, updated from the inner function via <<-

    rflatten <- function(y){
        lapply(y, function(z){
            if('children' %in% names(z)) {
                # store the person without the children element, then recurse
                ppl <<- c(ppl, list(z[names(z) != 'children']))
                rflatten(z$children)
            } else {
                ppl <<- c(ppl, list(z))
            }
        })
        ppl
    }

    rflatten(x)
}

dplyr::bind_rows(flatten_family(family))
#> # A tibble: 12 x 4
#>    name      age eyes  hair 
#>    <chr>   <dbl> <chr> <chr>
#>  1 Alice   40.0  blue  <NA> 
#>  2 Bob     20.0  blue  <NA> 
#>  3 Charlie 18.0  brown <NA> 
#>  4 Dan     12.0  green <NA> 
#>  5 Erin    69.0  green <NA> 
#>  6 Frank   45.0  blue  <NA> 
#>  7 George  24.0  blue  <NA> 
#>  8 Harry    2.00 green <NA> 
#>  9 Ingrid  22.0  brown brown
#> 10 Jack    29.0  brown <NA> 
#> 11 Karen   43.0  <NA>  <NA> 
#> 12 Larry   21.0  blue  <NA>
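The same flattening can also be written without the `<<-` accumulator, by having each call return its own flat list. The sketch below is not part of the original answer; `collect_people` and the small `fam` list are hypothetical names used only for illustration, and it relies on base R alone.

```r
# Hypothetical accumulator-free variant: each call returns a flat list.
collect_people <- function(people) {
  # For each person: the person without `children`, followed by descendants.
  per_person <- lapply(people, function(p) {
    kids <- p$children
    p$children <- NULL
    c(list(p), collect_people(kids))
  })
  # Concatenate the per-person lists into one flat list (depth-first order).
  # For an empty or NULL input this evaluates to NULL.
  do.call(c, per_person)
}

# Small stand-in data, three levels deep.
fam <- list(
  list(name = "Alice", age = 40, children = list(
    list(name = "Bob", age = 20, children = list(
      list(name = "Carl", age = 1)
    ))
  )),
  list(name = "Dan", age = 12)
)

flat <- collect_people(fam)
flat_names <- vapply(flat, function(p) p$name, character(1))
```

The resulting flat list has the same shape as the output of `flatten_family()`, so it should be possible to pass it to `dplyr::bind_rows()` in the same way.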
alistaire answered Nov 18 '22 21:11