
Convert nested list elements into data frame and bind the result into one data frame

Tags:

r

I have a nested list like this:

x <- list(x = list(a = 1, 
                   b = 2), 
          y = list(a = 3, 
                   b = 4))

I would like to convert each inner list into a data.frame and then bind all of the data frames into one.

For this level of nesting, I can do it with this line:

do.call(rbind.data.frame, lapply(x, as.data.frame, stringsAsFactors = FALSE))

The result is:

  a b
x 1 2
y 3 4

My problem is that I would like to achieve this regardless of the level of nesting. For example, this list has one more level, so I need a second, nested lapply:

x <- list(X = list(x = list(a = 1, 
                       b = 2), 
              y = list(a = 3, 
                       b = 4)),
     Y = list(x = list(a = 1, 
                       b = 2), 
              y = list(a = 3, 
                       b = 4)))

do.call(rbind.data.frame, lapply(x, function(x) do.call(rbind.data.frame, lapply(x, as.data.frame, stringsAsFactors = FALSE))))

    a b
X.x 1 2
X.y 3 4
Y.x 1 2
Y.y 3 4

Does anyone have an idea how to generalize this to any level of nesting? Thanks for any help.

Julien Navarre asked Apr 24 '17 14:04



1 Answer

Borrowing from Spacedman and flodel here, we can define the following pair of recursive functions:

library(tidyverse)  # I use dplyr and purrr here, plus tidyr further down below

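# Nesting depth of a list: 0 for a non-list, otherwise 1 + the deepest element.
depth <- function(this) ifelse(is.list(this), 1L + max(sapply(this, depth)), 0L)

bind_at_any_depth <- function(l) {
  if (depth(l) == 2) {
    # A list of flat lists: bind them directly into one data frame.
    return(bind_rows(l))
  } else {
    # Bind the deepest level first, then recurse on the now shallower list.
    l <- at_depth(l, depth(l) - 2, bind_rows)
    bind_at_any_depth(l)
  }
}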

We can now bind a list of arbitrary depth into a single data.frame:

bind_at_any_depth(x)
# A tibble: 2 × 2
      a     b
  <dbl> <dbl>
1     1     2
2     3     4
bind_at_any_depth(x_ext) # From P Lapointe
# A tibble: 5 × 2
      a     b
  <dbl> <dbl>
1     1     2
2     5     6
3     7     8
4     1     2
5     3     4
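For comparison, the same recursion can be sketched in base R alone, without purrr or dplyr. This is only an illustrative sketch, not part of the approach above: `is_row` and `bind_nested` are hypothetical names, and the code assumes every innermost list is a flat list of scalars with the same names.

```r
# Hypothetical helper: TRUE when `l` is a "row", i.e. a list whose
# elements are all non-list values.
is_row <- function(l) is.list(l) && !any(vapply(l, is.list, logical(1)))

# Hypothetical helper: rows become one-row data frames; every other
# level row-binds the data frames produced by its children.
bind_nested <- function(l) {
  if (is_row(l)) {
    return(as.data.frame(l, stringsAsFactors = FALSE))
  }
  do.call(rbind, lapply(l, bind_nested))
}

# The two-level example list from the question.
x <- list(X = list(x = list(a = 1, b = 2),
                   y = list(a = 3, b = 4)),
          Y = list(x = list(a = 1, b = 2),
                   y = list(a = 3, b = 4)))

res <- bind_nested(x)
print(res)
```

Because `rbind` is called with named arguments at each level, the row names of the result record the path to each row, much like the `X.x`, `X.y` row names in the question.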

If you want to keep track of the origin of each row, you can use this version:

bind_at_any_depth2 <- function(l) {
  if (depth(l) == 2) {
    l <- bind_rows(l, .id = 'source')
    l <- unite(l, 'source', contains('source'))
    return(l)
  } else {
    l <- at_depth(l, depth(l) - 2, bind_rows, .id = paste0('source', depth(l)))
    bind_at_any_depth2(l)
  }
}

This will add a source column:

bind_at_any_depth2(x_ext)
# A tibble: 5 × 3
  source     a     b
*  <chr> <dbl> <dbl>
1  X_x_1     1     2
2  X_y_z     5     6
3 X_y_zz     7     8
4  Y_x_1     1     2
5  Y_y_1     3     4
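The idea of carrying the origin along can also be sketched in base R by passing the path of names down the recursion. This is an assumption-laden illustration rather than the method above: `bind_with_source` is a hypothetical name, every level of the list is assumed to be named, and the labels it produces contain only the list names, so they will not match the `X_x_1` style labels that `bind_rows` generates.

```r
# Hypothetical helper: walks the list, remembering the names seen so far
# in `path`, and stores them in a `source` column once a flat row is reached.
bind_with_source <- function(l, path = character(0)) {
  if (is.list(l) && !any(vapply(l, is.list, logical(1)))) {
    df <- as.data.frame(l, stringsAsFactors = FALSE)
    return(data.frame(source = paste(path, collapse = "_"), df,
                      stringsAsFactors = FALSE))
  }
  # Recurse into each child, extending the path with the child's name.
  parts <- Map(function(child, name) bind_with_source(child, c(path, name)),
               l, names(l))
  out <- do.call(rbind, unname(parts))
  rownames(out) <- NULL  # the path lives in `source`, so drop row names
  out
}

# The two-level example list from the question.
x <- list(X = list(x = list(a = 1, b = 2),
                   y = list(a = 3, b = 4)),
          Y = list(x = list(a = 1, b = 2),
                   y = list(a = 3, b = 4)))

res_src <- bind_with_source(x)
print(res_src)
```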

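Note: purrr 0.2.3 renamed at_depth() to modify_depth(), so replace at_depth with modify_depth when running this against a current purrr. purrr also provides its own depth helper (vec_depth(), renamed pluck_depth() in purrr 1.0.0), but check how it counts levels before swapping it in for depth() (thanks @ManuelS).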

Axeman answered Sep 21 '22 08:09