New dataframe of means across dataframes

Question

I have five data frames of about 60 columns each that I need to combine. They all have the same columns, and since each one measures the same values, I'm combining them by taking their element-wise means. The issue isn't the ability to combine them, but doing so without repetitive code. Here is sample data and code (using three small data frames for brevity):

#reproducible random data
set.seed(123)

dat1 <- data.frame( a = rnorm(16), b = rnorm(16), c = rnorm(16), d = rnorm(16), e = rnorm(16), f = rnorm(16))
dat2 <- data.frame( a = rnorm(16), b = rnorm(16), c = rnorm(16), d = rnorm(16), e = rnorm(16), f = rnorm(16))
dat3 <- data.frame( a = rnorm(16), b = rnorm(16), c = rnorm(16), d = rnorm(16), e = rnorm(16), f = rnorm(16))

#This works but is inefficient

final_data<-data.frame(a=rowMeans(cbind(dat1$a,dat2$a,dat3$a)),
                       b=rowMeans(cbind(dat1$b,dat2$b,dat3$b)),
                       c=rowMeans(cbind(dat1$c,dat2$c,dat3$c)),
                       d=rowMeans(cbind(dat1$d,dat2$d,dat3$d)),
                       e=rowMeans(cbind(dat1$e,dat2$e,dat3$e)),
                       f=rowMeans(cbind(dat1$f,dat2$f,dat3$f))
)
#what results should look like
head(final_data)
#             a           b          c           d            e           f
# 1 0.573813625  0.17695841 -0.1434628 -0.53673101  0.353906578  0.24262067
# 2 0.135689926 -0.69206908  0.2888584 -0.37215810 -0.038298083 -0.23317107
# 3 0.004068807  0.44666945  0.5205118  0.09587453 -0.308528454  0.30516883
# 4 0.347100292  0.02401646  0.1409754 -0.15931120  0.587047386 -0.08684867
# 5 0.006529998  0.09010946  0.4932670  0.62606230 -0.005235813 -0.36967000
# 6 0.240225778 -0.45824825 -0.5000004  0.66131121  0.619480608  0.55650611

The issue here is that I don't want to rewrite a=rowMeans(cbind(dat1$a,dat2$a,dat3$a)) for each of 60 columns in the new data frame. Can you think of a good way to go about this?

EDIT: I'm going to accept the following answer since it allows me to set the columns to apply it over:

final_data1<-as.data.frame(sapply(colnames(dat1),function(i)
    rowMeans(cbind(dat1[,i],dat2[,i],dat3[,i]))))

> identical(final_data1,final_data)
[1] TRUE

Josh O'Brien · Accepted Answer

How about this?

(dat1+dat2+dat3)/3
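
As a quick sanity check, here is a minimal sketch comparing the arithmetic approach with the column-by-column `rowMeans` approach from the question. It assumes all data frames are numeric and have the same dimensions, column names, and column order; `make_dat` is a hypothetical helper introduced only to build smaller sample data.

```r
# Hypothetical helper: builds a small all-numeric data frame for illustration
set.seed(123)
make_dat <- function() data.frame(a = rnorm(16), b = rnorm(16), c = rnorm(16))
dat1 <- make_dat()
dat2 <- make_dat()
dat3 <- make_dat()

# Element-wise arithmetic on data frames of identical shape
by_sum <- (dat1 + dat2 + dat3) / 3

# Column-by-column rowMeans, as in the question's EDIT
by_rowmeans <- as.data.frame(sapply(colnames(dat1), function(i)
    rowMeans(cbind(dat1[, i], dat2[, i], dat3[, i]))))

# Compare with a numeric tolerance, since the two routes may round differently
all.equal(by_sum, by_rowmeans)
```

`all.equal` is used rather than `identical` because summing then dividing is not guaranteed to give bit-for-bit the same result as `rowMeans`.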

Or, to first select/reorder a subset of the columns, and then add the resulting data.frames, you could do this:

jj <- letters[1:6]
Reduce(`+`, lapply(list(dat1,dat2,dat3), `[`, jj))/3
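
The same idea extends to any number of data frames if they are kept in a list, so the divisor does not have to be hard-coded. The sketch below is one way to do that under stated assumptions: `dfs`, `cols`, `mean_df`, `arr`, and `mean_na` are illustrative names, not part of the original answer. Because `+` propagates `NA`, the second half shows a possible NA-aware variant that stacks the data frames into a 3-D array and averages over the third dimension with `rowMeans(..., dims = 2, na.rm = TRUE)`.

```r
# Illustrative data: a list of five data frames with identical numeric columns
set.seed(123)
dfs <- replicate(5,
                 data.frame(a = rnorm(16), b = rnorm(16), c = rnorm(16)),
                 simplify = FALSE)

# Columns to average (assumed to exist in every data frame)
cols <- c("a", "c")

# Sum the selected columns across the list, then divide by the list length
mean_df <- Reduce(`+`, lapply(dfs, `[`, cols)) / length(dfs)

# NA-aware variant: introduce a missing value to show the difference
dfs[[2]]$a[1] <- NA

# Stack into a rows x columns x data-frame array
arr <- array(unlist(lapply(dfs, function(d) as.matrix(d[cols]))),
             dim = c(nrow(dfs[[1]]), length(cols), length(dfs)))

# Average over the third dimension, ignoring NAs
mean_na <- as.data.frame(rowMeans(arr, dims = 2, na.rm = TRUE))
names(mean_na) <- cols
```

With `na.rm = TRUE`, a cell that is missing in one data frame is averaged over the remaining ones instead of becoming `NA`; whether that is the right treatment depends on the data.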

Tags: r

Jason
