
How to find the statistical mode?

In R, mean() and median() are standard functions which do what you'd expect. mode() tells you the internal storage mode of the object, not the value that occurs the most in its argument. But is there a standard library function that implements the statistical mode for a vector (or list)?

Nick asked Mar 30 '10 17:03



1 Answer

One more solution, which works for both numeric & character/factor data:

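    Mode <- function(x) {
      # Distinct values, in order of first appearance
      ux <- unique(x)
      # Count occurrences of each distinct value and return the most frequent one
      ux[which.max(tabulate(match(x, ux)))]
    }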
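A quick usage sketch may help show what the function returns for each kind of input. The sample vectors below are made up for illustration; the function definition is repeated so the snippet runs on its own.

```r
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Illustrative inputs (not from the original answer)
num_mode  <- Mode(c(3, 1, 2, 3, 2, 3))          # numeric input
chr_mode  <- Mode(c("a", "b", "b", "c"))        # character input
fact_mode <- Mode(factor(c("lo", "hi", "hi")))  # factor input; the result is still a factor

print(num_mode)
print(chr_mode)
print(fact_mode)
```

Because the result is taken from unique(x), it keeps the type of the input: a factor goes in, a one-element factor comes out.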

On my dinky little machine, that can generate & find the mode of a 10M-integer vector in about half a second.
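For anyone who wants to try a similar timing, here is a rough sketch. The answer does not say how the test vector was generated, so the value range (1 to 1000), the seed, and the smaller default size are assumptions; timings will vary by machine.

```r
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

set.seed(1)   # assumed seed, only for reproducibility
n <- 1e6      # raise to 1e7 to match the vector size mentioned above
x <- sample.int(1000L, n, replace = TRUE)  # assumed value range

# Time the mode lookup on its own (vector generation is excluded here)
timing <- system.time(m <- Mode(x))
print(timing)
print(m)
```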

If your data set might have multiple modes, the above solution takes the same approach as which.max, and returns the first-appearing value of the set of modes. To return all modes, use this variant (from @digEmAll in the comments):

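    Modes <- function(x) {
      # Distinct values, in order of first appearance
      ux <- unique(x)
      # Occurrence count for each distinct value
      tab <- tabulate(match(x, ux))
      # Keep every value whose count ties for the maximum
      ux[tab == max(tab)]
    }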
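To see the difference between the two functions, here is a small sketch on a vector with a tie. The input is invented for illustration, and both definitions are repeated so the snippet is self-contained.

```r
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

Modes <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}

# Illustrative input: 2 and 1 each occur twice
x <- c(2, 1, 2, 1, 3)

first_mode <- Mode(x)   # only the first-appearing mode
all_modes  <- Modes(x)  # every tied mode, in order of first appearance

print(first_mode)
print(all_modes)
```

Here Mode(x) gives 2, while Modes(x) gives 2 and 1, since both values tie for the highest count.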
Ken Williams answered Oct 07 '22 16:10
