I'm looking for a dead simple example of how to use aggregate to calculate means in R.
Say, I have the following data frame:
A B
100 85
200 95
300 110
400 105
And I want to calculate the mean values for some ranges with the following result:
RANGE MEAN
100-200 90
300-400 107.5
How would I go about doing this: cast() or aggregate()?
To compute group means with aggregate() in R, pass the numerical variable as the first argument, the categorical variable (wrapped in a list) as the second, and the function to apply (here, mean) as the third. Alternatively, use a formula of the form numerical ~ categorical.
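As a minimal sketch of both call styles, here is the question's data with a grouping column `grp` added purely for illustration (it is an assumption, not part of the original data frame):

```r
# Question's data plus a hypothetical grouping column `grp`
df <- data.frame(A = c(100, 200, 300, 400),
                 B = c(85, 95, 110, 105),
                 grp = c("low", "low", "high", "high"))

# Positional style: values first, grouping as a (named) list, then the function
by_list <- aggregate(df$B, by = list(grp = df$grp), FUN = mean)

# Formula style: numerical ~ categorical
by_formula <- aggregate(B ~ grp, data = df, FUN = mean)

# Both give a mean of 107.5 for "high" and 90 for "low";
# the value column is named "x" in the first result and "B" in the second
print(by_list)
print(by_formula)
```

The formula style tends to give more readable column names, since it keeps the names from the data frame.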
The aggregate() function computes summary statistics of the data by group, such as the mean, minimum, or sum.
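Switching the statistic should only require changing the function passed as FUN. A small sketch, again assuming a hypothetical grouping column `grp` that is not in the original data:

```r
# Question's data plus a hypothetical grouping column `grp`
df <- data.frame(A = c(100, 200, 300, 400),
                 B = c(85, 95, 110, 105),
                 grp = c("low", "low", "high", "high"))

# Same grouping, different summary functions
b_min <- aggregate(B ~ grp, data = df, FUN = min)  # smallest B per group
b_sum <- aggregate(B ~ grp, data = df, FUN = sum)  # total B per group

print(b_min)
print(b_sum)
```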
In the formula interface, we can use cbind() on the left-hand side to summarize several variables at once, and the '+' operator on the right-hand side to group by multiple variables.
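A short sketch of cbind() and '+' together in a formula. The grouping columns `g1` and `g2` are made up for this example and do not come from the question:

```r
# Hypothetical data: two numeric columns and two grouping columns
df <- data.frame(A = c(100, 200, 300, 400),
                 B = c(85, 95, 110, 105),
                 g1 = c("x", "x", "y", "y"),
                 g2 = c("p", "q", "p", "p"))

# cbind() on the left: summarize A and B together
# `+` on the right: one row per observed combination of g1 and g2
res <- aggregate(cbind(A, B) ~ g1 + g2, data = df, FUN = mean)
print(res)
```

Only combinations that actually occur in the data appear in the result, so this returns three rows rather than four.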
Assuming your data frame is named "x":
aggregate(x$B, list(cut(x$A, breaks=c(0, 200, 400))), mean)
# Group.1 x
# 1 (0,200] 90.0
# 2 (200,400] 107.5
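To get the RANGE/MEAN layout asked for in the question, one option is to pass `labels` to cut() and name the columns. This is a sketch built on the answer above; the label strings are chosen here to match the question's desired output:

```r
x <- data.frame(A = c(100, 200, 300, 400), B = c(85, 95, 110, 105))

# cut() bins A into (0,200] and (200,400]; `labels` renames the two bins
ranges <- cut(x$A, breaks = c(0, 200, 400), labels = c("100-200", "300-400"))

# Naming the list element sets the name of the grouping column
res <- aggregate(x$B, by = list(RANGE = ranges), FUN = mean)
names(res)[2] <- "MEAN"  # the value column is called "x" by default
print(res)
```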
With "data.table", you can do the following:
library(data.table)
as.data.table(x)[, .(MEAN = mean(B)), by = .(RANGE = cut(A, c(0, 200, 400)))]
# RANGE MEAN
# 1: (0,200] 90.0
# 2: (200,400] 107.5
Here is a basic example of aggregate usage.
> foo = data.frame(A=c(100,200,300,400),B=c(85,95,110,105))
> aggregate(foo$B,by=list(foo$A<250),FUN=mean)
Group.1 x
1 FALSE 107.5
2 TRUE 90.0