Suppose I have a list of data.frames (all with the same number of rows and columns):
dat1 <- as.data.frame(matrix(rnorm(25), ncol=5))
dat2 <- as.data.frame(matrix(rnorm(25), ncol=5))
dat3 <- as.data.frame(matrix(rnorm(25), ncol=5))
all.dat <- list(dat1=dat1, dat2=dat2, dat3=dat3)
How can I return a single data.frame that is the mean (or sum, etc.) for each element in the data.frames across the list (e.g., the mean of the first row and first column from list elements 1, 2, 3, and so on)? I have tried lapply and ldply in plyr, but these return the statistic for each data.frame within the list.
Edit: For some reason, this was retagged as homework. Not that it matters either way, but this is not a homework question. I just don't know why I can't get this to work. Thanks for any insight!
Edit2: For further clarification: I can get the results using loops, but I was hoping for a simpler and faster way, because the data I am using is a list of 1000+ data.frames, each 12 rows by 100 columns.
z <- matrix(0, nrow(all.dat$dat1), ncol(all.dat$dat1))
for (l in 1:nrow(all.dat$dat1)) {
  for (m in 1:ncol(all.dat$dat1)) {
    z[l, m] <- mean(unlist(lapply(all.dat, `[`, i = l, j = m)))
  }
}
With a result of the means:
> z
            [,1]        [,2]        [,3]        [,4]       [,5]
[1,] -0.64185488  0.06220447 -0.02153806  0.83567173  0.3978507
[2,] -0.27953054 -0.19567085  0.45718399 -0.02823715  0.4932950
[3,]  0.40506666  0.95157856  1.00017954  0.57434125 -0.5969884
[4,]  0.71972821 -0.29190645  0.16257478 -0.08897047  0.9703909
[5,] -0.05570302  0.62045662  0.93427522 -0.55295824  0.7064439
I was wondering if there was a less clunky and faster way to do this. Thanks!
To calculate the average in R, use the mean() function. The mean is the sum of the input values divided by the number of values.
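As a small illustration of that definition, the sketch below compares mean() with the sum divided by the count. The vector x is made-up example data, not taken from the question.

```r
# Hypothetical example values (not from the question)
x <- c(2, 4, 9)

# Built-in mean
m1 <- mean(x)

# Same quantity computed by hand: sum of values divided by their count
m2 <- sum(x) / length(x)

print(m1)
print(m2)
```

Both expressions compute the same quantity, which is why a sum across the list divided by the list length can stand in for a per-cell mean.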
Creating a list of data frames: call the list() function and pass each data frame you have created as an argument.
A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column.
Here is a one-liner with plyr. You can replace mean with any other function that you want.
library(plyr)  # provides laply() and aaply()
ans1 = aaply(laply(all.dat, as.matrix), c(2, 3), mean)
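If you would rather avoid the plyr dependency, the same idea can be sketched in base R. This is an alternative approach, not part of the answer above: it assumes all data.frames are numeric and share the same dimensions, as in the question, and the names sum.df, mean.df, arr, mean.mat, fast.mat and median.mat are made up for illustration.

```r
# Reproducible toy data with the same shape as in the question
set.seed(1)
all.dat <- replicate(3, as.data.frame(matrix(rnorm(25), ncol = 5)),
                     simplify = FALSE)
names(all.dat) <- c("dat1", "dat2", "dat3")

# Option 1: element-wise sum across the list, then divide for the mean.
# Reduce() applies `+` pairwise, and data.frames add cell by cell.
sum.df  <- Reduce(`+`, all.dat)
mean.df <- sum.df / length(all.dat)

# Option 2: stack the data.frames into a rows x cols x list-length array,
# then apply any function over each (row, column) cell.
arr <- simplify2array(lapply(all.dat, as.matrix))
mean.mat   <- apply(arr, c(1, 2), mean)
median.mat <- apply(arr, c(1, 2), median)

# For the mean specifically, rowMeans() with dims = 2 averages over the
# third dimension without an explicit apply().
fast.mat <- rowMeans(arr, dims = 2)
```

Option 1 only covers statistics that can be built from a running sum, while Option 2 accepts any summary function at the cost of building the full array first. For a list of 1000+ data.frames of 12 by 100, the array is small enough that either should be practical, though that is worth timing on your own data.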