I have a nested list that contains some data frames, which can appear at any level of nesting. What I want to end up with is a flat, single-level list whose elements are only the data frames, with everything else discarded.
I have come up with a solution for this, but it looks very clunky and I am sure there ought to be a more elegant solution.
Importantly, I'm looking for something in base R that can extract data frames at any level inside the nested list. I have tried unlist() and dabbled with rapply(), but somehow have not found a satisfying solution.
Example code follows: an example list, what I am actually trying to achieve, and my own solution which I am not very happy with. Thanks for any help!
# extract dfs from list
# example of multi-level list with some dfs in it
# note, dfs could be nested at any level
problem1 <- list(x1 = 1,
                 x2 = list(
                   x3 = "dog",
                   x4 = data.frame(cats = c(1, 2),
                                   pigs = c(3, 4))
                 ),
                 x5 = data.frame(sheep = c(1,2,3),
                                 goats = c(4,5,6)),
                 x6 = list(a = 2,
                           b = "c"),
                 x7 = head(cars,5))
# want to end up with flat list like this (names format is optional)
result1 <- list(x2.x4 = data.frame(cats = c(1, 2),
                                   pigs = c(3, 4)),
                x5 = data.frame(sheep = c(1,2,3),
                                goats = c(4,5,6)),
                x7 = head(cars,5))
# my solution (not very satisfactory)
exit_loop <- FALSE
while(exit_loop == FALSE){
  # find dfs (logical)
  idfs <- sapply(problem1, is.data.frame)
  # check if all data frames
  exit_loop <- all(idfs)
  # remove anything not df or list
  problem1 <- problem1[idfs | sapply(problem1, is.list)]
  # find dfs again (logical)
  idfs <- sapply(problem1, is.data.frame)
  # unlist only the non-df part
  problem1 <- c(problem1[idfs], unlist(problem1[!idfs], recursive = FALSE))
}
A data frame is simply a list with class "data.frame", whose components must be vectors (numeric, character, logical), factors, matrices (numeric), lists, or even other data frames.
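Because a data frame is itself a list, is.list() returns TRUE for it, and a fully recursive unlist() descends into it instead of keeping it whole. The following is a minimal sketch of that behaviour; the object names df and flat are made up for illustration.

```r
# A small data frame to inspect (hypothetical example data)
df <- data.frame(cats = c(1, 2), pigs = c(3, 4))

is.list(df)        # a data frame is a list of columns
is.data.frame(df)  # and it also carries the "data.frame" class
length(df)         # number of columns, i.e. number of list elements

# A fully recursive unlist() flattens the data frame into one atomic
# vector rather than returning it as a data frame
flat <- unlist(list(a = 1, b = list(d = df)))
is.data.frame(flat)
length(flat)
```

This is why any recursive search for data frames has to test is.data.frame() before is.list(): checking is.list() first would treat each data frame as just another list to descend into.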
A nested data frame is a data frame in which one or more columns are lists of data frames.
More commonly, nested data frames are created with tidyr::nest(). df %>% nest(x, y) specifies the columns to be nested, i.e. the columns that will appear in the inner data frame (tidyr 1.0.0 and later write this as nest(data = c(x, y))). Alternatively, you can nest() a grouped data frame created by dplyr::group_by().
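A nested data frame can also be built in base R by assigning a list of data frames as a column. This is only a sketch of the idea without tidyr; the names outer, group and data are hypothetical.

```r
# Hypothetical outer data frame with one row per group
outer <- data.frame(group = c("a", "b"))

# Assign a list with one data frame per row; this becomes a list column
outer$data <- list(
  data.frame(x = 1:2, y = c(10, 20)),
  data.frame(x = 3:5, y = c(30, 40, 50))
)

is.data.frame(outer)            # the outer object is a data frame
is.list(outer$data)             # the new column is a plain list
is.data.frame(outer$data[[1]])  # each cell holds an inner data frame
sapply(outer$data, nrow)        # row counts of the inner data frames
```

A search that stops as soon as is.data.frame() is TRUE would return such an object as a single data frame and would not look inside its list column.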
A nested list whose elements are equal-length vectors can be turned into a data frame with data.frame(), do.call() and cbind(). do.call() passes the elements of the nested list to cbind() as separate arguments, cbind() binds them together by column, and data.frame() converts the result into a data frame.
Single sub-lists can then be extracted from this data frame as columns using the $ operator. The sub-lists of a nested list can also be bound as rows in a matrix object.
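The column-wise and row-wise binding described above might look like the following sketch. The list nested and the variables combined and row_matrix are hypothetical names, and the elements are assumed to be vectors of equal length.

```r
# Hypothetical nested list whose elements are equal-length vectors
nested <- list(a = 1:3, b = 4:6)

# do.call() passes the list elements to cbind() as separate arguments,
# giving a matrix with one column per element; data.frame() converts it
combined <- data.frame(do.call(cbind, nested))
print(combined)

# Extract a single column (one of the original elements) with $
combined$a

# Bind the same elements as rows of a matrix instead
row_matrix <- do.call(rbind, nested)
dim(row_matrix)
```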
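Maybe consider a simple recursive function like this:
find_df <- function(x) {
  # A data frame is itself a list, so test for it first and wrap it
  # in a list so that it survives the unlist() call below
  if (is.data.frame(x))
    return(list(x))
  # Any other non-list value is discarded
  if (!is.list(x))
    return(NULL)
  # Recurse into each element, then flatten by one level only
  unlist(lapply(x, find_df), FALSE)
}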
Results
> find_df(problem1)
$x2.x4
  cats pigs
1    1    3
2    2    4

$x5
  sheep goats
1     1     4
2     2     5
3     3     6

$x7
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
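One thing to be aware of is that find_df() returns NULL rather than an empty list when the input contains no data frames. The sketch below repeats the function so that it runs on its own and adds a small wrapper for that case; find_df_list and the test list deep are hypothetical names, not part of the original answer.

```r
# Function copied from the answer above
find_df <- function(x) {
  if (is.data.frame(x))
    return(list(x))
  if (!is.list(x))
    return(NULL)
  unlist(lapply(x, find_df), FALSE)
}

# Hypothetical wrapper: always return a list, even when nothing is found
find_df_list <- function(x) {
  out <- find_df(x)
  if (is.null(out)) list() else out
}

# Hypothetical test input with a data frame three levels deep
deep <- list(a = list(b = list(c = head(cars, 2), d = "text")), e = 1:3)

found <- find_df_list(deep)
names(found)                       # names are built from the path
all(sapply(found, is.data.frame))  # every element is a data frame

# An input without any data frames gives an empty list rather than NULL
length(find_df_list(list(1, "a", list(TRUE))))
```

Whether the wrapper is needed depends on how the result is used afterwards; code that loops over the result with lapply() works with either NULL or an empty list.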