What is the difference between ave() function and mean() function in R?

Tags:

r

What is the difference between ave() and mean() function in R?

For example, I am trying to find the average of a particular column of a data frame in R.

I came across these two functions:

mean(dataset$age, na.rm= TRUE)
ave(dataset$age, FUN=function(x)mean(x, na.rm = TRUE))

The first function clearly gave me the mean as a single value, whereas the second also gave me the mean but repeated it in a vector with as many elements as there are rows in the data frame. Why is that? And what is the use of a function like ave() when mean() already gives the average?

Arjun Raaghav asked Dec 11 '22 02:12


1 Answer

Elaborating on @akrun's comments -

Suppose x <- 1:10.

1) mean always returns a vector of length 1.

mean(x)
[1] 5.5

2) ave always returns a vector of the same length as the input vector.

ave(x)
[1] 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5
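
Applied to the situation in the question, here is a minimal sketch with a made-up age vector (the values are illustrative and stand in for dataset$age, they are not the asker's data). It suggests that ave fills every position, including the one that was NA, with the value FUN returns:

```r
# Hypothetical stand-in for dataset$age, with one missing value
age <- c(20, NA, 30, 40)

# mean() collapses the vector to one number
single_mean <- mean(age, na.rm = TRUE)
single_mean
# [1] 30

# ave() returns that number once per element of age
repeated_mean <- ave(age, FUN = function(x) mean(x, na.rm = TRUE))
repeated_mean
# [1] 30 30 30 30
```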

The cool thing about ave is that you can also divide x into groups and apply any function FUN to get an output, again, of same length as x -

Let's divide x into two groups of 3 and 7 elements, i.e. rep(1:2, c(3, 7))

(grouping <- rep(1:2, c(3,7)))
[1] 1 1 1 2 2 2 2 2 2 2

# calculating mean for each group
ave(x, grouping, FUN = mean)
[1] 2 2 2 7 7 7 7 7 7 7

# calculating sum for each group
ave(x, grouping, FUN = sum)
[1]  6  6  6 49 49 49 49 49 49 49

# any custom function can be applied to ave, not just mean
ave(x, grouping, FUN = function(a) sum(a^2))
[1]  14  14  14 371 371 371 371 371 371 371

The results above are similar to what you'd get from tapply, the difference being that the output of ave has the same length as x, while tapply returns one value per group.

tapply(x, grouping, mean)
1 2 
2 7 

tapply(x, grouping, sum)
1  2 
6 49 

tapply(x, grouping, function(a) sum(a^2))
1   2 
14 371
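
One way to see the relationship is to expand the tapply result back to the length of x by indexing it with the group labels. This is only a sketch of the idea, not how ave is implemented internally; the names group_means and expanded are my own:

```r
x <- 1:10
grouping <- rep(1:2, c(3, 7))

# One value per group, named by group label ("1" and "2")
group_means <- tapply(x, grouping, mean)

# Look up each element's group label to get one value per element;
# as.vector() drops the names and array attributes left by tapply
expanded <- as.vector(group_means[as.character(grouping)])

# Compare with what ave() returns directly
same_as_ave <- isTRUE(all.equal(expanded, ave(x, grouping, FUN = mean)))
same_as_ave
```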

Finally, you can define your own function and pass it to the FUN argument of ave, so you are not restricted to calculating the mean.

Because the output has the same length as the input, ave is very useful for adding columns to tabular data. For example, you can calculate a group mean (or another summary statistic) and assign it back to the original data.
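
A small sketch of that use, assuming a made-up data frame (df, team, score, team_mean and centered are all hypothetical names chosen for illustration):

```r
# Hypothetical data: two teams with a numeric score per row
df <- data.frame(
  team  = c("a", "a", "b", "b", "b"),
  score = c(10, 20, 30, 40, 50)
)

# Group mean, repeated on every row of its group, so it fits as a column
df$team_mean <- ave(df$score, df$team, FUN = mean)

# Each row's deviation from its own group mean
df$centered <- df$score - df$team_mean

df
```

Since ave returns one value per row, the result can be assigned straight to a new column without merging a summary table back onto the data.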

Shree answered Dec 12 '22 15:12