I have a set of data in a nested list. Each member of the top-level list has descriptors (not necessarily the same ones) and, optionally, a list of children; those children can have children of their own, and so on to an arbitrary depth. A minimal example:
family <- list(
  list(name = "Alice", age = 40, eyes = "blue", children = list(
    list(name = "Bob", age = 20, eyes = "blue"),
    list(name = "Charlie", age = 18, eyes = "brown")
  )),
  list(name = "Dan", age = 12, eyes = "green"),
  list(name = "Erin", age = 69, eyes = "green", children = list(
    list(name = "Frank", age = 45, eyes = "blue", children = list(
      list(name = "George", age = 24, eyes = "blue", children = list(
        list(name = "Harry", age = 2, eyes = "green")
      )),
      list(name = "Ingrid", age = 22, eyes = "brown", hair = "brown"),
      list(name = "Jack", age = 29, eyes = "brown")
    )),
    list(name = "Karen", age = 43),
    list(name = "Larry", age = 21, eyes = "blue")
  ))
)
> str(family, max.level = 2)
List of 3
$ :List of 4
..$ name : chr "Alice"
..$ age : num 40
..$ eyes : chr "blue"
..$ children:List of 2
$ :List of 3
..$ name: chr "Dan"
..$ age : num 12
..$ eyes: chr "green"
$ :List of 4
..$ name : chr "Erin"
..$ age : num 69
..$ eyes : chr "green"
..$ children:List of 3
Ideally, I would like to create a dataframe such that each row is a family member, and each column is an attribute from the list:
name age eyes
1 Alice 40 blue
2 Bob 20 blue
3 Charlie 18 brown
(etc)
But it is unclear how to do this recursively to an arbitrary depth. I have managed to get the top-level members using map(family, ~ .$name), but I don't understand how to go deeper. It is complicated by the fact that some members don't have any children.
I have gone over the purrr documentation, but I can't find anything that seems like it would help. Maybe I am not using the right terminology.
Advice is appreciated. Thanks!
You can always recursively walk the list and collect the pieces of information. In addition to map_dfr from purrr, my solution uses defaults from plyr to handle cases of missing eye or hair color.
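get_people <- function(x) {
  if (is.null(x)) return(NULL)
  if (!is.null(x$name)) { # if there's a name, we're at the level of a person
    # set the children aside so only the person's own fields become columns
    children <- x$children
    x$children <- NULL
    # fill in any missing descriptors with NA so every row has the same columns
    row <- data.frame(plyr::defaults(x, list(age = NA, eyes = NA, hair = NA)),
                      stringsAsFactors = FALSE)
    # this person's row first, then the rows of all descendants
    rbind(row, get_people(children))
  } else {
    # no name: x is a list of people, so recurse into each and row-bind
    purrr::map_dfr(x, get_people)
  }
}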
When you apply this to your list:
> get_people(family)
name age eyes hair
1 Alice 40 blue <NA>
2 Bob 20 blue <NA>
3 Charlie 18 brown <NA>
4 Dan 12 green <NA>
5 Erin 69 green <NA>
6 Frank 45 blue <NA>
7 George 24 blue <NA>
8 Harry 2 green <NA>
9 Ingrid 22 brown brown
10 Jack 29 brown <NA>
11 Karen 43 <NA> <NA>
12 Larry 21 blue <NA>
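If you would rather not depend on plyr just for this one call, the default-filling step can be approximated with base R's modifyList(). This is only a sketch of that single step, not a drop-in replacement for the whole answer: fill_defaults is a hypothetical helper name, and the resulting columns come out in the order of the defaults (age, eyes, hair, name) rather than name first, so they are reordered explicitly here.

```r
# Hypothetical helper (not from plyr): fill in descriptors a person lacks.
fill_defaults <- function(person,
                          defaults = list(age = NA, eyes = NA, hair = NA)) {
  # modifyList() starts from `defaults` and overwrites/appends with `person`
  modifyList(defaults, person)
}

karen <- list(name = "Karen", age = 43)   # no eyes, no hair
row <- data.frame(fill_defaults(karen), stringsAsFactors = FALSE)

# Put the columns back into the order used in the output above
row <- row[, c("name", "age", "eyes", "hair")]
row
```

One caveat with this approach: modifyList() drops any element whose value in person is NULL, which does not matter for the data in the question.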
Here's an option that iterates over the children elements, collecting people into a flattened list as it goes:
flatten_family <- function(x){
  ppl <- list()   # accumulator, updated from the inner function via <<-
  rflatten <- function(y){
    lapply(y, function(z){
      if(exists('children', z)) {
        # store the person without their children, then descend
        ppl <<- c(ppl, list(z[-which(names(z) == 'children')]))
        rflatten(z$children)
      } else {
        # leaf: store the person as-is
        ppl <<- c(ppl, list(z))
      }
    })
    ppl
  }
  rflatten(x)
}
dplyr::bind_rows(flatten_family(family))
#> # A tibble: 12 x 4
#> name age eyes hair
#> <chr> <dbl> <chr> <chr>
#> 1 Alice 40.0 blue <NA>
#> 2 Bob 20.0 blue <NA>
#> 3 Charlie 18.0 brown <NA>
#> 4 Dan 12.0 green <NA>
#> 5 Erin 69.0 green <NA>
#> 6 Frank 45.0 blue <NA>
#> 7 George 24.0 blue <NA>
#> 8 Harry 2.00 green <NA>
#> 9 Ingrid 22.0 brown brown
#> 10 Jack 29.0 brown <NA>
#> 11 Karen 43.0 <NA> <NA>
#> 12 Larry 21.0 blue <NA>