I have a dataframe like this:
experiment iter  results
    A       1     30.0
    A       2     23.0
    A       3     33.3
    B       1     313.0
    B       2     323.0
    B       3     350.0
 ....

Is there a way to tally the results by applying a function subject to a condition? In the example above, the condition is all iterations of a particular experiment.
A   sum of results (30 + 23 + 33.3)
B   sum of results (313 + 323 + 350)

I am thinking of the "apply" function, but can't find a way to get it to work.
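To make the examples reproducible, here is a minimal sketch that rebuilds the data shown above. The name DF is an assumption (the question does not name the data frame), and only the six rows shown are included; the rows elided by "...." are left out.

```r
# Rebuild the six rows shown in the question.
# The name DF is an assumption; the elided rows are not included.
DF <- data.frame(
  experiment = rep(c("A", "B"), each = 3),
  iter       = rep(1:3, times = 2),
  results    = c(30.0, 23.0, 33.3, 313.0, 323.0, 350.0)
)

# Inspect the structure: 6 observations of 3 variables
str(DF)
```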
There are many ways to do this. If you want a function other than sum, just swap it in: where a function takes a FUN argument, change FUN=sum to, e.g., FUN=mean, FUN=var, or FUN=length. Let's explore some alternatives:
aggregate function in base.
> aggregate(results ~ experiment, FUN=sum, data=DF)
  experiment results
1          A    86.3
2          B   986.0

Or maybe tapply?
> with(DF, tapply(results, experiment, FUN=sum))
    A     B
 86.3 986.0

Also ddply from the plyr package:
> # library(plyr)
> ddply(DF[, -2], .(experiment), numcolwise(sum))
  experiment results
1          A    86.3
2          B   986.0

> ## Alternative syntax
> ddply(DF, .(experiment), summarize, sumResults = sum(results))
  experiment sumResults
1          A       86.3
2          B      986.0

Also the dplyr package:
> require(dplyr)
> DF %>% group_by(experiment) %>% summarise(sumResults = sum(results))
Source: local data frame [2 x 2]

  experiment  sumResults
1          A        86.3
2          B       986.0

Using sapply and split, equivalent to tapply:
> with(DF, sapply(split(results, experiment), sum))
    A     B
 86.3 986.0

If you are concerned about timing, data.table is your friend:
> # library(data.table)
> DT <- data.table(DF)
> DT[, sum(results), by=experiment]
   experiment    V1
1:          A  86.3
2:          B 986.0

Not so popular, but the doBy package is nice (equivalent to aggregate, even in syntax!):
> # library(doBy)
> summaryBy(results~experiment, FUN=sum, data=DF)
  experiment results.sum
1          A        86.3
2          B       986.0

by also helps in this situation:
> (Aggregate.sums <- with(DF, by(results, experiment, sum)))
experiment: A
[1] 86.3
-------------------------------------------------------------------------
experiment: B
[1] 986

If you want the result as a matrix, use either cbind or rbind:
> cbind(results=Aggregate.sums)
  results
A    86.3
B   986.0

sqldf from the sqldf package could also be a good option:
> library(sqldf)
> sqldf("select experiment, sum(results) `sum.results`
      from DF group by experiment")
  experiment sum.results
1          A        86.3
2          B       986.0

xtabs also works (only when FUN=sum):
> xtabs(results ~ experiment, data=DF)
experiment
    A     B
 86.3 986.0
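Most of the approaches above accept any summary function, not only sum. As a minimal sketch of that idea in base R only, the following computes the per-experiment mean with aggregate, and then several summaries at once with split and sapply. It assumes the six-row DF from the question; the names means and stats are hypothetical and chosen only for illustration.

```r
# Assumed data: the six rows shown in the question (DF is an assumed name)
DF <- data.frame(
  experiment = rep(c("A", "B"), each = 3),
  iter       = rep(1:3, times = 2),
  results    = c(30.0, 23.0, 33.3, 313.0, 323.0, 350.0)
)

# Same call as the aggregate example, with FUN swapped from sum to mean
means <- aggregate(results ~ experiment, data = DF, FUN = mean)
print(means)

# Several summaries per experiment in one pass:
# split() groups the values, sapply() applies the function to each group
# and binds the named vectors into a matrix (rows = statistics, columns = experiments)
stats <- sapply(
  split(DF$results, DF$experiment),
  function(x) c(sum = sum(x), mean = mean(x), n = length(x))
)
print(stats)
```

The anonymous function can return any named numeric vector, so adding another statistic (for example var) is a one-line change.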