
Convert a list with inconsistent naming to a data frame, with variable depth

Tags:

list

dataframe

r

Consider the following list:

x <- list("a" = list("b", "c"),
          "d" = list("e", "f" = list("g", "h")),
          "i" = list("j", "k" = list("l" = list("m", "n" = list("o", "p")))))

It is worth noting that:

  • Names and elements are not necessarily a single character long.
  • The depth of nesting is not known in advance.

Given x, my aim is to output the data frame:

y <- data.frame(
  main_level = c(rep("a", 2), rep("d", 3), rep("i", 4)),
  level1 = c("b", "c", "e", rep("f", 2), "j", rep("k", 3)),
  level2 = c(NA, NA, NA, "g", "h", NA, "l", "l", "l"),
  level3 = c(NA, NA, NA,  NA,  NA, NA, "m", "n", "n"), 
  level4 = c(NA, NA, NA,  NA,  NA, NA, NA, "o", "p")
)
> y
  main_level level1 level2 level3 level4
1          a      b   <NA>   <NA>   <NA>
2          a      c   <NA>   <NA>   <NA>
3          d      e   <NA>   <NA>   <NA>
4          d      f      g   <NA>   <NA>
5          d      f      h   <NA>   <NA>
6          i      j   <NA>   <NA>   <NA>
7          i      k      l      m   <NA>
8          i      k      l      n      o
9          i      k      l      n      p

NOTE that a typo was corrected in y above.

The above implies that there will be a variable number of columns as well, depending on the depth of the nesting.
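
As a rough illustration of that dependency, the column count can be derived from the nesting depth. The sketch below uses base R only; `list_depth` is a hypothetical helper name introduced here, not a function from any package.

```r
x <- list("a" = list("b", "c"),
          "d" = list("e", "f" = list("g", "h")),
          "i" = list("j", "k" = list("l" = list("m", "n" = list("o", "p")))))

# Hypothetical helper: number of list layers above the deepest leaf.
list_depth <- function(l) {
  if (!is.list(l)) return(0L)        # a leaf adds no layer
  if (length(l) == 0L) return(1L)    # an empty list still counts as one layer
  1L + max(vapply(l, list_depth, integer(1)))
}

# One column per layer: main_level plus level1, level2, ...
n_cols <- list_depth(x)
col_names <- c("main_level", paste0("level", seq_len(n_cols - 1L)))
```

For the `x` above this gives five columns, matching `y`; a deeper list would simply yield more `level` columns.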

The solutions I've found online for nested lists assume either that the list naming structure is more or less consistent, which is not the case here, or that the list depth is identical throughout. For instance, the answers to "How to convert a nested lists to dataframe in R?" and "Converting nested list to dataframe" do not apply, because the lists there are named much more consistently.

Clarinetist asked Mar 13 '26 06:03


2 Answers

Here's a way mainly relying on rrapply:

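rrapply::rrapply(x, how = "melt") |>
  apply(1, function(row){
    # keep only the entries that contain a letter
    newrow <- row[grep("[A-Za-z]", row)]
    # pad each row with NA up to the number of output columns
    # (purrr 1.0.0 renamed vec_depth() to pluck_depth())
    length(newrow) <- purrr::vec_depth(x) - 1
    newrow
  }) |> 
  t() |> as.data.frame() |>
  # derive the level columns from the depth instead of hard-coding 1:4
  `colnames<-`(c("main_level", paste0("level", seq_len(purrr::vec_depth(x) - 2))))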

output

  main_level level1 level2 level3 level4
1          a      b   <NA>   <NA>   <NA>
2          a      c   <NA>   <NA>   <NA>
3          d      e   <NA>   <NA>   <NA>
4          d      f      g   <NA>   <NA>
5          d      f      h   <NA>   <NA>
6          i      j   <NA>   <NA>   <NA>
7          i      k      l      m   <NA>
8          i      k      l      n      o
9          i      k      l      n      p

Note that this is still quite crude, and there might be a better way to reshape the output of rrapply. For instance, row[grep("[A-Za-z]", row)] keeps only the entries that contain a letter, so it will drop any name or value made up solely of digits or symbols. I have also not tested whether length(newrow) <- purrr::vec_depth(x) - 1 is a good way of guessing the length in general, but it works here.
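
To make the caveat about the letter filter concrete, here is a small base R sketch. The row contents are hypothetical stand-ins for a melted row (they are an assumption, not actual rrapply output), and `keep_lettered` is a name made up for this illustration.

```r
# Hypothetical helper wrapping the filter used in the pipeline above.
keep_lettered <- function(row) row[grep("[A-Za-z]", row)]

# Assumed row shape: names, a digit-only placeholder, NA padding, leaf value.
kept_ok  <- keep_lettered(c("d", "1", NA, "e"))  # placeholder and NA are dropped
kept_bad <- keep_lettered(c("i", "k", "42"))     # the digit-only leaf "42" is lost too
```

In the second call the leaf value disappears along with the placeholders, which is the situation where the filter would need a different rule.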

Maël answered Mar 14 '26 19:03


Here is a recursive function that makes no assumptions other than the structure you described:

list_to_df <- function(l) {
  
  leaves <- list()
  
  go_deeper <- function(l, index=1, path=NULL) {

    # we can still go deeper    
    if (is.list(l[[index]])) {
      
      path <- c(path, names(l)[index])
      l <- l[[index]]
      
      lapply(seq_along(l), function(i) go_deeper(l, i, path))

    # this is the final node (leaf)      
    } else {
      
      leaves <<- c(leaves, list(c(path, l[[index]])))
    }
  }
  
  # this saves the paths to each last node (leaf) in 'leaves' as a side effect
  go_deeper(list(l))
  
  # now just make a data frame from the 'leaves' list
  len.max <- max(lengths(leaves))
  leaves <- sapply(leaves, function(x) c(x, rep(NA, len.max-length(x))))
  leaves <- as.data.frame(t(leaves))
  names(leaves) <- c('main_level', paste0('level', seq_len(ncol(leaves)-1)))
  
  leaves 
}
list_to_df(x)
#   main_level level1 level2 level3 level4
# 1          a      b   <NA>   <NA>   <NA>
# 2          a      c   <NA>   <NA>   <NA>
# 3          d      e   <NA>   <NA>   <NA>
# 4          d      f      g   <NA>   <NA>
# 5          d      f      h   <NA>   <NA>
# 6          i      j   <NA>   <NA>   <NA>
# 7          i      k      l      m   <NA>
# 8          i      k      l      n      o
# 9          i      k      l      n      p
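
For comparison, the same idea can be sketched without the `<<-` side effect, by having the recursion return the paths directly. This is a variant under the same structural assumptions, not the code above; `leaf_paths`, `paths_to_df` and the sample list `x2` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: return one character vector per leaf,
# holding the names on the way down followed by the leaf value.
leaf_paths <- function(l, path = character(0)) {
  out <- list()
  nms <- names(l)
  if (is.null(nms)) nms <- rep("", length(l))   # fully unnamed list
  for (i in seq_along(l)) {
    if (is.list(l[[i]])) {
      # named sub-list: extend the path and recurse
      out <- c(out, leaf_paths(l[[i]], c(path, nms[i])))
    } else {
      # leaf: close the path with its value
      out <- c(out, list(c(path, as.character(l[[i]]))))
    }
  }
  out
}

# Hypothetical helper: pad every path with NA and bind the rows.
# Assumes at least one leaf exists.
paths_to_df <- function(paths) {
  width <- max(lengths(paths))
  rows <- lapply(paths, function(p) c(p, rep(NA_character_, width - length(p))))
  df <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(df) <- c("main_level", paste0("level", seq_len(width - 1)))
  df
}

# Sample input with multi-character names and uneven depth
x2 <- list(alpha = list("one", beta = list("two", "three")),
           gamma = list("four"))
df2 <- paths_to_df(leaf_paths(x2))
```

Returning the paths keeps the recursion free of shared state, at the cost of repeatedly concatenating lists, which should be acceptable for small inputs like these.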

Robert Hacken answered Mar 14 '26 20:03


