
Calculate group mean, sum, or other summary stats, and assign the result as a column in the original data

I want to calculate the mean (or any other summary statistic of length one, e.g. min, max, length, sum) of a numeric variable ("value") within each level of a grouping variable ("group").

The summary statistic should be assigned to a new variable which has the same length as the original data. That is, each row of the original data should carry the statistic for its own group - the data set should not be collapsed to one row per group. For example, consider the group mean:

Before

id  group  value
1   a      10
2   a      20
3   b      100
4   b      200

After

id  group  value  grp.mean.values
1   a      10     15
2   a      20     15
3   b      100    150
4   b      200    150
Mike asked May 19 '11 04:05





1 Answer

Have a look at the ave function. Its default summary function is mean, so the group mean is simply:

df$grp.mean.values <- ave(df$value, df$group) 

If you want to use ave to calculate something else per group, pass the function you want via the named argument FUN, e.g. FUN = min:

df$grp.min <- ave(df$value, df$group, FUN = min) 
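
As a self-contained check, the two calls above can be run against the example data from the question. This is a minimal sketch: the data frame is rebuilt by hand from the "Before" table, and the column name grp.min is simply the one used above.

```r
# Example data, copied from the "Before" table in the question
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# ave() uses FUN = mean by default and returns a vector
# with one entry per row of the input (no collapsing)
df$grp.mean.values <- ave(df$value, df$group)

# FUN has to be given by name: unnamed arguments after the
# first are interpreted as grouping variables
df$grp.min <- ave(df$value, df$group, FUN = min)

print(df)
```

The result still has four rows, and every row carries the statistic of its own group, which matches the "After" table in the question.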
Henrico answered Oct 05 '22 04:10
