Apparently dplyr's summarise function doesn't include an option for "mode". Based on the simple data frame example below, I would like to determine the mode, or most frequently occurring number, for each group of "Category". So for group "A" the mode is 22, for "B" it's 12 and 14, and there is no repeating number for "C".
I found some example functions online, but none addressed the case where a group has no repeating numbers. Is a custom function needed, or is there a mode option somewhere? I don't want to rely on another specialized package just for its mode function. It would be nice to find a simple, elegant solution using a combination of base R, dplyr, tidyr, etc.
If a custom function is used, it has to work when there are no repeating numbers, as well as when more than one number is tied for most frequent.
Any help would be greatly appreciated! This seems like it should be an easy solution in R, so I was surprised to learn that there is no simple summarise_each(funs(mode)... option.
If a custom function is used, please break it down with explanations. I'm still relatively new to R functions.
Category <- c("A", "B", "B", "C", "A", "A", "A", "B", "C", "B", "C", "C")
Number <- c(22, 12, 12, 8, 22, 22, 18, 14, 10, 14, 1, 3)
DF <- data.frame(Category, Number)
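We can use a custom function that returns NA when no value repeats, and otherwise returns all values tied for the highest count as a single comma-separated string:
Mode <- function(x) {
  ux <- unique(x)                    # distinct values in x
  if (!anyDuplicated(x)) {
    NA_character_                    # nothing repeats, so there is no mode
  } else {
    tbl <- tabulate(match(x, ux))    # count of each distinct value
    toString(ux[tbl == max(tbl)])    # all values tied for the highest count
  }
}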
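Since the question asks for a breakdown, here is a minimal sketch of what each step inside the function produces, using the values of Number for group "B" from the example data. The intermediate names idx, modes, result and no_repeats are only for illustration and are not part of the function itself.

```r
# Values of Number for group "B" in the example data
x <- c(12, 12, 14, 14)

ux <- unique(x)               # distinct values, in order of first appearance: 12 14
idx <- match(x, ux)           # position of each element of x within ux: 1 1 2 2
tbl <- tabulate(idx)          # how many times each position occurs: 2 2
modes <- ux[tbl == max(tbl)]  # values whose count equals the highest count: 12 14
result <- toString(modes)     # collapsed into one string: "12, 14"

# For group "C" no value repeats, so anyDuplicated() returns 0 and the
# function returns NA instead of reporting every value as a mode
no_repeats <- anyDuplicated(c(8, 10, 1, 3))
```

Because the modes are collapsed with toString(), the result is a character value, which is why the no-mode case returns NA_character_ rather than a numeric NA. The function can then be applied per group: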
library(dplyr)

DF %>%
  group_by(Category) %>%
  summarise(NumberMode = Mode(Number))
#   Category NumberMode
#     <fctr>      <chr>
# 1        A         22
# 2        B     12, 14
# 3        C       <NA>
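If the modes are needed as numbers rather than as a comma-separated string, one possible alternative (a sketch, not part of the answer above) is to count each value per group with dplyr and keep the repeated values that are tied for the highest count. The name modes_long is made up for this example, and the name argument of count() assumes a reasonably recent dplyr version.

```r
library(dplyr)

Category <- c("A", "B", "B", "C", "A", "A", "A", "B", "C", "B", "C", "C")
Number <- c(22, 12, 12, 8, 22, 22, 18, 14, 10, 14, 1, 3)
DF <- data.frame(Category, Number)

modes_long <- DF %>%
  count(Category, Number, name = "n") %>%  # frequency of each value within each group
  group_by(Category) %>%
  filter(n > 1, n == max(n)) %>%           # keep repeated values tied for the top count
  ungroup() %>%
  select(Category, Number)                 # one row per mode; Number stays numeric
```

With this approach group "B" contributes two rows (12 and 14), while a group with no repeating number, such as "C", is simply absent from the result instead of showing NA.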