What is the difference between the functions tapply and ave?

Tags: r, tapply, aggregate

I can't wrap my mind around the ave function. I have read the help page and searched the net, but I still cannot understand what it does. I understand that it applies some function to subsets of the observations, but not in the same way as, for example, tapply.

Could someone please enlighten me, perhaps with a small example?

Thanks, and excuse me for perhaps an unusual request.

460

asked Mar 09 '14 22:03

ECII

1 Answer

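tapply applies a function to each group and returns one result per factor level. ave applies the function to the same groups, but returns a vector the same length as the input, with each group's result written back to the positions that group occupies in the original data.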

ave is handy for producing a new column in a data frame with summary data.

A short example:

tapply(iris$Sepal.Length, iris$Species, FUN=mean)
    setosa versicolor  virginica 
     5.006      5.936      6.588

One value, the mean for each factor level.

ave on iris produces 150 results, which line up with the original data frame:

ave(iris$Sepal.Length, iris$Species, FUN=mean)
  [1] 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006
 [17] 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006
 [33] 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006 5.006
 [49] 5.006 5.006 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936
 [65] 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936
 [81] 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936 5.936
 [97] 5.936 5.936 5.936 5.936 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588
[113] 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588
[129] 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588 6.588
[145] 6.588 6.588 6.588 6.588 6.588 6.588

As noted in the comments, here the single value is being recycled to fill each location in the original data.
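To make the relationship concrete, here is a minimal sketch that checks the two results against each other and then stores the ave output as a column. The names `group_means`, `row_means`, `expanded`, `iris2` and `Species.Mean` are made up for illustration and are not part of the original answer.

```r
# Group means, one per species (length 3, named by factor level)
group_means <- tapply(iris$Sepal.Length, iris$Species, FUN = mean)

# The same means, repeated so that there is one value per row (length 150)
row_means <- ave(iris$Sepal.Length, iris$Species, FUN = mean)

# Looking up each row's species in the tapply result reproduces the ave result.
# as.vector() drops the names and dim attribute that tapply attaches.
expanded <- as.vector(group_means[as.character(iris$Species)])
all.equal(expanded, row_means)

# Because the result lines up with the rows, it can be stored as a new column.
# `iris2` is a copy so the built-in data set is left untouched.
iris2 <- iris
iris2$Species.Mean <- row_means
head(iris2[, c("Species", "Sepal.Length", "Species.Mean")], 3)
```

In other words, for a function that returns one value per group, ave gives the tapply result expanded back out to the rows.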

If the function returns multiple values, they fill the group's positions in order, and are recycled if there are fewer values than positions. For example:

d <- data.frame(a=rep(1:2, each=5), b=1:10)
ave(d$b, d$a, FUN=rev)
 [1]  5  4  3  2  1 10  9  8  7  6
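The same idea works for any function that returns one value per element of the group. The following is a small sketch along those lines; the variable names (`running`, `centered`, `g`, `interleaved`) are illustrative only. Note that FUN should be passed by name: it comes after `...` in ave's signature, so an unnamed function would be treated as another grouping variable.

```r
d <- data.frame(a = rep(1:2, each = 5), b = 1:10)

# Running total within each group
running <- ave(d$b, d$a, FUN = cumsum)
running
#  1  3  6 10 15  6 13 21 30 40

# Deviation of each value from its own group mean
centered <- ave(d$b, d$a, FUN = function(x) x - mean(x))
centered
# -2 -1  0  1  2 -2 -1  0  1  2

# Groups do not need to be contiguous: results go back to the original positions
g <- c("x", "y", "x", "y")
interleaved <- ave(c(1, 2, 3, 4), g, FUN = cumsum)
interleaved
# 1 2 4 6
```

tapply with the same functions would instead return a list-like array with one vector per group, which then has to be reassembled by hand.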

Thanks to Josh and thelatemail.

124

answered Nov 14 '22 23:11

Matthew Lundberg
