How can I get means in each column?

I have a big data frame like this:

ID  c_Al   c_D    c_Hy      occ
A     0     0      0        2306
B     0     0      0        3031
C     0     0      1        2581
D     0     0      1        1917
E     0     0      1        2708
F     0     1      0        2751
G     0     1      0        1522
H     0     1      0        657
I     0     1      1        469
J     0     1      1        2629
L     1     0      0        793
L     1     0      0        793
M     1     0      0        564
N     1     0      1        2617
O     1     0      1        1167
P     1     0      1        389
Q     1     0      1        294
R     1     1      0        1686
S     1     1      0        992

How can I get the mean of occ for each value (0 and 1) of each column, laid out like this?

               0        1
    c_Al    1506.2  1641.2
    c_D     748.6   1467.5
    c_Hy    1506.2  1641.2

I have tried aggregate(occ~c_Al, mean, data=table2), but it has to be repeated for every column, and ddply has the same limitation. I also tried a loop, for(i in 1:dim(table2)[1]){ aggregate(occ~[,i], mean, data=table2)}, but it does not work.

Vivian asked Dec 01 '22 19:12


1 Answer

I would just use melt and dcast from "reshape2":

library(reshape2)
dfL <- melt(table2, id.vars = c("ID", "occ"))
dcast(dfL, variable ~ value, value.var = "occ", fun.aggregate = mean)
#   variable        0        1
# 1     c_Al 2057.100 1032.778
# 2      c_D 1596.667 1529.429
# 3     c_Hy 1509.500 1641.222
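
For anyone curious about the intermediate step, here is a rough sketch of the same reshape-then-aggregate idea written by hand, without reshape2. It assumes table2 looks like the sample in the question (rebuilt below so the snippet runs on its own); the names long and res are only illustrative, and the hand-built long table is an approximation of what melt returns, not its exact output.

```r
# Rebuild the sample data from the question (assumed to match table2).
table2 <- data.frame(
  ID   = c("A","B","C","D","E","F","G","H","I","J","L","L","M","N","O","P","Q","R","S"),
  c_Al = rep(c(0, 1), times = c(10, 9)),
  c_D  = c(0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,1,1),
  c_Hy = c(0,0,1,1,1,0,0,0,1,1,0,0,0,1,1,1,1,0,0),
  occ  = c(2306,3031,2581,1917,2708,2751,1522,657,469,2629,
           793,793,564,2617,1167,389,294,1686,992)
)

# Long format: one row per (original row, indicator column) pair.
long <- data.frame(
  occ      = rep(table2$occ, times = 3),
  variable = rep(names(table2)[2:4], each = nrow(table2)),
  value    = unlist(table2[2:4], use.names = FALSE)
)

# Mean of occ for every combination of indicator column and its 0/1 value.
res <- tapply(long$occ, list(long$variable, long$value), mean)
res
```

The result is a 3 x 2 matrix with one row per indicator column and one column per value, which is the same layout dcast produces above.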

Of course, base R can handle this just fine too.

Here, I've used tapply and vapply:

vapply(table2[2:4], function(x) tapply(table2$occ, x, mean), numeric(2L))
#       c_Al      c_D     c_Hy
# 0 2057.100 1596.667 1509.500
# 1 1032.778 1529.429 1641.222
t(vapply(table2[2:4], function(x) tapply(table2$occ, x, mean), numeric(2L)))
#             0        1
# c_Al 2057.100 1032.778
# c_D  1596.667 1529.429
# c_Hy 1509.500 1641.222
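
The loop attempted in the question can also be made to work. Two things appear to go wrong there: 1:dim(table2)[1] iterates over rows rather than columns, and occ~[,i] is not valid formula syntax. A minimal sketch of one possible fix builds each formula from the column name with reformulate and collects the results with sapply. The data frame is rebuilt from the question's sample so the snippet is self-contained; cols and res are illustrative names.

```r
# Rebuild the sample data from the question (assumed to match table2).
table2 <- data.frame(
  ID   = c("A","B","C","D","E","F","G","H","I","J","L","L","M","N","O","P","Q","R","S"),
  c_Al = rep(c(0, 1), times = c(10, 9)),
  c_D  = c(0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,1,1),
  c_Hy = c(0,0,1,1,1,0,0,0,1,1,0,0,0,1,1,1,1,0,0),
  occ  = c(2306,3031,2581,1917,2708,2751,1522,657,469,2629,
           793,793,564,2617,1167,389,294,1686,992)
)

cols <- c("c_Al", "c_D", "c_Hy")

# For each indicator column, build the formula occ ~ <column> and aggregate.
res <- sapply(cols, function(col) {
  agg <- aggregate(reformulate(col, response = "occ"), data = table2, FUN = mean)
  setNames(agg$occ, agg[[col]])  # name the two means "0" and "1"
})

t(res)  # one row per column, as in the layout the question asks for
```

This calls aggregate once per column, so it is not faster than the tapply version; it is mainly useful if the formula interface is preferred.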
A5C1D2H2I1M1N2O1R2T1 answered Dec 04 '22 02:12