
Aggregate data in R

Tags: r, aggregate

I'm looking for a dead simple example on how to use aggregate and calculate means in R.

Say, I have the following data frame:

A      B
100    85
200    95
300    110
400    105

And I want to calculate the mean values for some ranges with the following result:

RANGE         MEAN
100-200       90
300-400       107.5

How would I go about doing this: with cast() or aggregate()?

asked Jun 29 '12 11:06 by Johnny


2 Answers

Assuming your data frame is named "x":

aggregate(x$B, list(cut(x$A, breaks=c(0, 200, 400))), mean)
#     Group.1     x
# 1   (0,200]  90.0
# 2 (200,400] 107.5
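
If you want the output to carry the RANGE and MEAN headers from the question, here is a minimal sketch using the formula interface of aggregate(). The labels "100-200" and "300-400" and the helper column RANGE are assumptions made for illustration; the breaks are the same as above.

```r
# Sample data from the question, named x as assumed in this answer
x <- data.frame(A = c(100, 200, 300, 400), B = c(85, 95, 110, 105))

# Bin A into (0,200] and (200,400]; the labels are illustrative choices
x$RANGE <- cut(x$A, breaks = c(0, 200, 400), labels = c("100-200", "300-400"))

# Formula interface: mean of B within each level of RANGE
res <- aggregate(B ~ RANGE, data = x, FUN = mean)

# Rename the value column so the result reads RANGE / MEAN
names(res)[2] <- "MEAN"
res
```

The formula form keeps the column names from the data frame, so only the value column needs renaming.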

With "data.table", you can do the following:

library(data.table)
as.data.table(x)[, .(MEAN = mean(B)), by = .(RANGE = cut(A, c(0, 200, 400)))]
#        RANGE  MEAN
# 1:   (0,200]  90.0
# 2: (200,400] 107.5

answered Nov 15 '22 06:11 by A5C1D2H2I1M1N2O1R2T1


Here is a basic example of aggregate usage.

> foo = data.frame(A=c(100,200,300,400),B=c(85,95,110,105))
> aggregate(foo$B,by=list(foo$A<250),FUN=mean)
  Group.1     x
1   FALSE 107.5
2    TRUE  90.0

answered Nov 15 '22 05:11 by jrouquie