aggregate a column by sum and another column by mean at the same time

Tags: r, aggregate

I want to use the aggregate function on a data frame, but sum one column and take the average of another column.

Here is an example data frame:

Manager   Category  Amount  SqFt
Joe           Rent     150   500
Alice         Rent     250   700
Joe      Utilities      50   500
Alice    Utilities      75   700

I cannot do something like the following. Is there an easy way to do it?

Avg_CPSF=aggregate(cbind(Amount,SqFt)~Manager,data=aaa,FUN=c(sum,mean))

Eventually I need

Manager  Amount   SqFt
Joe       200      500
Alice     325      700

so that I can calculate cost per square foot as Amount/SqFt.

M.Adams asked Feb 20 '13 16:02

1 Answer

There are several ways to do this. Here are some that I like (all assuming we're starting with a data.frame named "mydf"):
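
For anyone who wants to run the snippets, here is a minimal sketch of that starting point. It assumes the four rows from the question are typed in by hand; the name "mydf" is the one used in this answer, but the construction itself is not from the original post.

```r
# Rebuild the question's example table as a data.frame named mydf.
# Values are copied from the question; entering them by hand is an assumption.
mydf <- data.frame(
  Manager  = c("Joe", "Alice", "Joe", "Alice"),
  Category = c("Rent", "Rent", "Utilities", "Utilities"),
  Amount   = c(150, 250, 50, 75),
  SqFt     = c(500, 700, 500, 700),
  stringsAsFactors = FALSE  # keep Manager and Category as character columns
)
```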

Using ave and unique:

unique(within(mydf, {
  Amount <- ave(Amount, Manager, FUN = sum)
  SqFt <- ave(SqFt, Manager, FUN = mean)
  rm(Category)
}))
#   Manager Amount SqFt
# 1     Joe    200  500
# 2   Alice    325  700

Using data.table:

library(data.table)
DT <- data.table(mydf)
DT[, list(Amount = sum(Amount), SqFt = mean(SqFt)), by = "Manager"]
#    Manager Amount SqFt
# 1:     Joe    200  500
# 2:   Alice    325  700

Using "sqldf":

library(sqldf)
sqldf("select Manager, sum(Amount) `Amount`, 
      avg(SqFt) `SqFt` from mydf group by Manager")

Using aggregate and merge:

merge(aggregate(Amount ~ Manager, mydf, sum), 
      aggregate(SqFt ~ Manager, mydf, mean))
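
Any of these results can then feed the cost-per-square-foot step the question mentions. The sketch below uses the aggregate and merge version; the names "totals" and "CostPerSqFt" are hypothetical, chosen only for illustration, and the data frame is rebuilt by hand from the question's table so the snippet runs on its own.

```r
# Example data copied from the question (hand-entered, an assumption).
mydf <- data.frame(
  Manager  = c("Joe", "Alice", "Joe", "Alice"),
  Category = c("Rent", "Rent", "Utilities", "Utilities"),
  Amount   = c(150, 250, 50, 75),
  SqFt     = c(500, 700, 500, 700),
  stringsAsFactors = FALSE
)

# Sum Amount and average SqFt per Manager, then join the two results
# on their shared Manager column.
totals <- merge(aggregate(Amount ~ Manager, mydf, sum),
                aggregate(SqFt ~ Manager, mydf, mean))

# Cost per square foot, as described in the question: Amount / SqFt.
# For Joe this is 200 / 500 = 0.4.
totals$CostPerSqFt <- totals$Amount / totals$SqFt
totals
```

Note that merge sorts its result by the key column, so the rows come back in alphabetical order of Manager rather than in the order they appear in the data.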
A5C1D2H2I1M1N2O1R2T1 answered Oct 27 '22 04:10