R - Extracting information from list of lists of data.frames

I have two needs, both connected to a dataset similar to the reproducible one below. I have a list of 18 entities, each composed of a list of 17-19 data.frames. A reproducible dataset follows (it uses matrices instead of data.frames, but I do not suppose that makes a difference):

test <- list(list(matrix(10:(50-1), ncol = 10), matrix(60:(100-1), ncol = 10), matrix(110:(150-1), ncol = 10)),
             list(matrix(200:(500-1), ncol = 10), matrix(600:(1000-1), ncol = 10), matrix(1100:(1500-1), ncol = 10)))

  1. I need to split each data.frame/matrix into two parts (at a given number of rows) and save the result to a new list of lists.
  2. I need to extract and save given column(s) from every data.frame in the list of lists.

I have no idea how to go about doing this other than with for() loops, but I am sure it should be possible with the apply() family of functions.

Thank you for reading

EDIT:

My expected output would look as follows:

extractedColumns <- list(list(matrix(10:(50-1), ncol = 10)[, 2], matrix(60:(100-1), ncol = 10)[, 2], matrix(110:(150-1), ncol = 10)[, 2]),
                         list(matrix(200:(500-1), ncol = 10)[, 2], matrix(600:(1000-1), ncol = 10)[, 2], matrix(1100:(1500-1), ncol = 10)[, 2]))


numToSubset <- 3
# each matrix is split into: the first nrow - numToSubset rows, and the last numToSubset rows
subsetFrames <- list(list(list(head(matrix(10:(50-1), ncol = 10), -numToSubset), tail(matrix(10:(50-1), ncol = 10), numToSubset)),
                          list(head(matrix(60:(100-1), ncol = 10), -numToSubset), tail(matrix(60:(100-1), ncol = 10), numToSubset)),
                          list(head(matrix(110:(150-1), ncol = 10), -numToSubset), tail(matrix(110:(150-1), ncol = 10), numToSubset)))
                     # etc...
                     )

It gets to look very messy, hope you can follow what I want.

pun11 asked Mar 11 '23 01:03


1 Answer

You can use two nested lapplys:

lapply(test, function(x) lapply(x, '[', c(2, 3)))

Output:

[[1]]
[[1]][[1]]
[1] 11 12

[[1]][[2]]
[1] 61 62

[[1]][[3]]
[1] 111 112


[[2]]
[[2]][[1]]
[1] 201 202

[[2]][[2]]
[1] 601 602

[[2]][[3]]
[1] 1101 1102

Explanation

The outer lapply iterates over the two lists in test. Each of those two lists contains three matrices. The inner lapply iterates over those three matrices and subsets each one with c(2, 3) (that is what the '[' function in the inner lapply does).
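To make the iteration order concrete, here is a minimal sketch of the same computation written with explicit for() loops. The names nested and looped are chosen only for this illustration, and test is rebuilt from the question so the snippet runs on its own.

```r
# Dataset from the question
test <- list(
  list(matrix(10:49, ncol = 10), matrix(60:99, ncol = 10), matrix(110:149, ncol = 10)),
  list(matrix(200:499, ncol = 10), matrix(600:999, ncol = 10), matrix(1100:1499, ncol = 10))
)

# Nested lapply version
nested <- lapply(test, function(x) lapply(x, '[', c(2, 3)))

# Equivalent explicit loops: i walks the outer list, j walks each inner list
looped <- vector("list", length(test))
for (i in seq_along(test)) {
  looped[[i]] <- vector("list", length(test[[i]]))
  for (j in seq_along(test[[i]])) {
    looped[[i]][[j]] <- test[[i]][[j]][c(2, 3)]
  }
}

# Compare the two results
identical(nested, looped)
```

The lapply version avoids pre-allocating the result lists and keeps the nesting of the output identical to the nesting of the input.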

Note: In the case of a matrix, [ with a single index subsets elements 2 and 3 (as in the output above), but the same call subsets columns 2 and 3 when used on a data.frame.
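The difference can be checked on a single matrix and its data.frame counterpart. The sketch below also shows one way to get the column output asked for in the question, using y[, 2], which selects a column for both matrices and data.frames. The names m and df are illustrative, and column 2 is simply the column used in the question's expected output.

```r
# Dataset from the question
test <- list(
  list(matrix(10:49, ncol = 10), matrix(60:99, ncol = 10), matrix(110:149, ncol = 10)),
  list(matrix(200:499, ncol = 10), matrix(600:999, ncol = 10), matrix(1100:1499, ncol = 10))
)

m  <- test[[1]][[1]]      # a 4 x 10 matrix
df <- as.data.frame(m)    # the same values as a data.frame

m[c(2, 3)]    # elements 2 and 3 of the matrix (a vector of length 2)
df[c(2, 3)]   # columns 2 and 3 of the data.frame (a 4 x 2 data.frame)

# Explicit [rows, columns] indexing selects column 2 in both cases
extractedColumns <- lapply(test, function(x) lapply(x, function(y) y[, 2]))
extractedColumns[[1]][[1]]
```

To keep several columns, replace 2 with a vector such as c(2, 3); the result for each element is then a matrix (or data.frame) rather than a plain vector.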

Subsetting rows and columns

lapply is very flexible when combined with anonymous functions. Changing the code to:

# set rows and columns to whatever you need (the values here are only examples)
rows <- 1:2
columns <- c(2, 3)
lapply(test, function(x) lapply(x, function(y) y[rows, columns]))

lets you specify any combination of rows and columns you want.
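The same pattern can cover the first requirement in the question, splitting every matrix into two parts by row. The sketch below is one possible approach, not the only one: splitRows is a hypothetical helper written for this example, and it relies on base R's head() and tail(), where a negative n in head() drops the last n rows.

```r
# Dataset from the question
test <- list(
  list(matrix(10:49, ncol = 10), matrix(60:99, ncol = 10), matrix(110:149, ncol = 10)),
  list(matrix(200:499, ncol = 10), matrix(600:999, ncol = 10), matrix(1100:1499, ncol = 10))
)

numToSubset <- 3

# Hypothetical helper: returns list(all rows except the last n, the last n rows)
splitRows <- function(m, n) {
  list(head(m, -n), tail(m, n))
}

# Apply the helper to every matrix, keeping the list-of-lists structure
subsetFrames <- lapply(test, function(x) lapply(x, splitRows, n = numToSubset))

dim(subsetFrames[[1]][[1]][[1]])   # first part of the first matrix
dim(subsetFrames[[1]][[1]][[2]])   # second part of the first matrix
```

Because head() and tail() also have data.frame methods, the same call should work on a list of lists of data.frames. If numToSubset is at least the number of rows, the first part comes back with zero rows.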

LyzandeR answered Mar 23 '23 05:03
