Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiple functions in a single tapply or aggregate statement

Is it possible to include two functions within a single tapply or aggregate statement?

Below I use two tapply statements and two aggregate statements: one for mean and one for SD.
I would prefer to combine the statements.

my.Data = read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", sep = "", header = TRUE)

with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x)}))
with(my.Data, tapply(weight, list(age, sex), function(x) {sd(x)  }))

with(my.Data, aggregate(weight ~ age + sex, FUN = mean)
with(my.Data, aggregate(weight ~ age + sex, FUN =   sd)

# this does not work:

with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x) ; sd(x)}))

# I would also prefer that the output be formatted something similar to that 
# show below.  `aggregate` formats the output perfectly.  I just cannot figure 
# out how to implement two functions in one statement.

  age    sex   mean        sd
adult female   97.5  3.535534
adult   male     90        NA
young female   80.0        NA
young   male     75        NA

I can always run two separate statements and merge the output. I was just hoping there might be a slightly more convenient solution.

I found the answer below posted here: Apply multiple functions to column using tapply

f <- function(x) c(mean(x), sd(x))
do.call( rbind, with(my.Data, tapply(weight, list(age, sex), f)) )

However, neither the rows or columns are labeled.

     [,1]     [,2]
[1,] 97.5 3.535534
[2,] 80.0       NA
[3,] 90.0       NA
[4,] 75.0       NA

I would prefer a solution in base R. A solution from the plyr package was posted at the link above. If I can add the correct row and column headings to the above output, it would be perfect.

like image 941
Mark Miller Avatar asked Mar 05 '13 03:03

Mark Miller


3 Answers

But these should have:

with(my.Data, aggregate(weight, list(age, sex), function(x) { c(MEAN=mean(x), SD=sd(x) )}))

with(my.Data, tapply(weight, list(age, sex), function(x) { c(mean(x) , sd(x) )} ))
# Not a nice structure but the results are in there

with(my.Data, aggregate(weight ~ age + sex, FUN =  function(x) c( SD = sd(x), MN= mean(x) ) ) )
    age    sex weight.SD weight.MN
1 adult female  3.535534 97.500000
2 young female        NA 80.000000
3 adult   male        NA 90.000000
4 young   male        NA 75.

The principle to be adhered to is to have your function return "one thing" which could be either a vector or a list but cannot be the successive invocation of two function calls.

like image 134
IRTFM Avatar answered Nov 15 '22 11:11

IRTFM


If you'd like to use data.table, it has with and by built right into it:

library(data.table)
myDT <- data.table(my.Data, key="animal")


myDT[, c("mean", "sd") := list(mean(weight), sd(weight)), by=list(age, sex)]


myDT[, list(mean_Aggr=sum(mean(weight)), sd_Aggr=sum(sd(weight))), by=list(age, sex)]
     age    sex mean_Aggr   sd_Aggr
1: adult female     96.0  3.6055513
2: young   male     76.5  2.1213203
3: adult   male     91.0  1.4142136
4: young female     84.5  0.7071068

I used a slightly different data set so as to not have NA values for sd

like image 27
Ricardo Saporta Avatar answered Nov 15 '22 13:11

Ricardo Saporta


In the spirit of sharing, if you are familiar with SQL, you might also consider the "sqldf" package. (Emphasis added because you do need to know, for instance, that mean is avg in order to get the results you want.)

sqldf("select age, sex, 
      avg(weight) `Wt.Mean`, 
      stdev(weight) `Wt.SD` 
      from `my.Data` 
      group by age, sex")
    age    sex Wt.Mean    Wt.SD
1 adult female    97.5 3.535534
2 adult   male    90.0 0.000000
3 young female    80.0 0.000000
4 young   male    75.0 0.000000
like image 7
A5C1D2H2I1M1N2O1R2T1 Avatar answered Nov 15 '22 13:11

A5C1D2H2I1M1N2O1R2T1