
Most frequent value (mode) by group [duplicate]

Tags:

r

I am trying to find the most frequent value by group. In the following example dataframe:

df<-data.frame(a=c(1,1,1,1,2,2,2,3,3),b=c(2,2,1,2,3,3,1,1,2))  
> df  
  a b  
1 1 2  
2 1 2  
3 1 1  
4 1 2  
5 2 3  
6 2 3  
7 2 1  
8 3 1  
9 3 2  

I would like to add a column 'c' that holds the most frequent value of 'b' within each group of 'a'. The desired output is:

> df  
  a b c  
1 1 2 2    
2 1 2 2    
3 1 1 2    
4 1 2 2    
5 2 3 3    
6 2 3 3    
7 2 1 3    
8 3 1 1   
9 3 2 1    

I tried using table and tapply but didn't get it right. Is there a fast way to do that?
Thanks!

Asif Shakeel asked Mar 25 '15 12:03



1 Answer

Building on David's comments, your solution is the following:

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

library(dplyr)
df %>% group_by(a) %>% mutate(c=Mode(b))

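Note that when a is 3 there is a tie in b (1 and 2 each occur once). Because which.max returns the first maximum, Mode returns the tied value that appears first, so the mode for that group is 1.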
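Since the question mentions trying table and tapply, here is a minimal base-R sketch of the same idea for anyone who prefers not to load dplyr. It reuses the Mode helper from above and swaps in ave() for the group_by/mutate step; that substitution is my own assumption about what a base-R equivalent could look like, not something taken from David's comments.

```r
# Same helper as in the answer: most frequent value, first one wins on ties.
Mode <- function(x) {
  ux <- unique(x)                        # distinct values, in order of first appearance
  ux[which.max(tabulate(match(x, ux)))]  # which.max picks the first maximum
}

df <- data.frame(a = c(1, 1, 1, 1, 2, 2, 2, 3, 3),
                 b = c(2, 2, 1, 2, 3, 3, 1, 1, 2))

# ave() applies FUN to b within each group of a and
# recycles the single result across that group's rows.
df$c <- ave(df$b, df$a, FUN = Mode)
df
```

This should give the same column c as the dplyr version, including the value 1 for the tied group where a is 3.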

dimitris_ps answered Nov 03 '22 20:11