Calculate frequency of occurrence in an array using R

Tags: r

I have an array

a <- c(1,1,1,1,1,2,3,4,5,5,5,5,5,6,7,7,7,7)

I would like a command that tells me which number occurs most frequently in the array.

Is there a simple command for this?

user1723765 asked Dec 12 '12 14:12



3 Answers

The table() function is sufficient for this, and particularly useful if your data have more than one mode.

Consider the following options, all related to table() and max().


# Your vector
a = c(1,1,1,1,1,2,3,4,5,5,5,5,5,6,7,7,7,7)

# Basic frequency table
table(a)
# a
# 1 2 3 4 5 6 7 
# 5 1 1 1 5 1 4 

# Only gives me the value for highest frequency
# Doesn't tell me which number that is though
max(table(a))
# [1] 5

# Gives me a logical vector, which might be useful
# but not what you're asking for in this question
table(a) == max(table(a))
# a
#    1     2     3     4     5     6     7 
# TRUE FALSE FALSE FALSE  TRUE FALSE FALSE 

# This is probably more like what you're looking for
which(table(a) == max(table(a)))
# 1 5 
# 1 5 

# Or, maybe this
names(which(table(a) == max(table(a))))
# [1] "1" "5"

As indicated in the comments, in some cases you might want to see the two or three most commonly occurring values, in which case sort() is useful:

sort(table(a))
# a
# 2 3 4 6 7 1 5 
# 1 1 1 1 4 5 5 

You can also set a threshold for which values to return in your table. For instance, if you wanted to return only those numbers which occurred more than once:

sort(table(a)[table(a) > 1])
# a
# 7 1 5 
# 4 5 5 
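
If you only want the few most common values rather than the whole sorted table, one way is to sort in decreasing order and keep the first entries. This is a minimal sketch and not part of the original answer; the names counts, n and top_n are just illustrative, and n = 3 is an arbitrary choice.

```r
# Your vector
a <- c(1,1,1,1,1,2,3,4,5,5,5,5,5,6,7,7,7,7)

counts <- table(a)

# Assumption: we want the three most frequent values
n <- 3

# Sort from most to least frequent, then keep the first n entries
top_n <- head(sort(counts, decreasing = TRUE), n)
top_n
```

Here the result holds the values 1, 5 and 7 with counts 5, 5 and 4. Tied values all appear only if n is large enough to include them.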

A5C1D2H2I1M1N2O1R2T1 answered Oct 31 '22 21:10


Use the table() function:

## Your vector
a <- c(1,1,1,1,1,2,3,4,5,5,5,5,5,6,7,7,7,7)

## Frequency table
counts <- table(a)

## The most frequent value and its count
## (which.max() returns only the first maximum, so the tie with 5 is not shown)
counts[which.max(counts)]
# 1
# 5

## Or just the most frequent value, as a character string
names(counts)[which.max(counts)]
# [1] "1"
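
Two things to keep in mind with the table() approach: names() gives back character strings, and which.max() reports only the first of several tied values. If you need every mode in the vector's original type, one common base-R idiom combines unique(), match() and tabulate(). This is a sketch of that alternative, not something taken from the answers here; ux, counts and modes are illustrative names.

```r
# Your vector
a <- c(1,1,1,1,1,2,3,4,5,5,5,5,5,6,7,7,7,7)

# Distinct values, in order of first appearance
ux <- unique(a)

# match() maps each element to its position in ux;
# tabulate() then counts how often each position occurs
counts <- tabulate(match(a, ux))

# Keep every value whose count equals the highest count
modes <- ux[counts == max(counts)]
modes
# [1] 1 5
```

The result is a numeric vector, so no as.numeric() conversion is needed afterwards.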

Rui Afonso Pereira answered Oct 31 '22 22:10


I wrote some personal code a few years ago to find the mode and a little more. As Ananda showed, it's pretty obvious stuff:

smode <- function(x) {
    xtab <- table(x)
    # keep every entry whose count equals the highest count
    modes <- xtab[max(xtab) == xtab]
    # all modes share the same count, so take the first (safe with multiple modes)
    mag <- as.numeric(modes[1])
    # use names(modes) alone to keep the modes as character strings
    themodes <- as.numeric(names(modes))
    mout <- list(themodes = themodes, modeval = mag)
    return(mout)
}

Blah blah copyright blah blah use as you like but don't make money off it.
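
For anyone wanting to try it, here is a short usage sketch on the vector from the question. The function is repeated in condensed form only so the snippet runs on its own; res is an illustrative name.

```r
# Condensed copy of smode() from above, so this snippet is self-contained
smode <- function(x) {
  xtab <- table(x)
  modes <- xtab[xtab == max(xtab)]
  list(themodes = as.numeric(names(modes)),  # the modal values
       modeval  = as.numeric(modes[1]))      # the count they share
}

a <- c(1,1,1,1,1,2,3,4,5,5,5,5,5,6,7,7,7,7)
res <- smode(a)

res$themodes
# [1] 1 5
res$modeval
# [1] 5
```

Note that as.numeric(names(...)) only makes sense for numeric input; for character or factor data you would probably keep the names as they are.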

Carl Witthoft answered Oct 31 '22 21:10