
Calculating the mode in a multimodal list in Python

I'm trying to calculate the mode (most frequent value) of a list of values in Python. I came up with a solution, which gave the wrong answer anyway, but I then realised that my data may be multimodal:

e.g. 1, 1, 2, 3, 4, 4 has modes 1 and 4

Here is what I came up with so far:

def mode(valueList):
  frequencies = {}
  for value in valueList:
    if value in frequencies:
      frequencies[value] += 1
    else:
      frequencies[value] = 1
  mode = max(frequencies.itervalues())
  return mode

I think the problem here is that I'm returning the maximum count rather than the value that count belongs to. Can anyone suggest a better way of doing this that also works when there is more than one mode? Failing that, how can I fix what I've got so far so that it identifies a single mode?

As you can probably tell, I'm very new to Python. Thanks for the help.

Edit: I should have mentioned that I'm on Python 2.4.

Captastic asked Mar 05 '12 13:03



2 Answers

In Python >=2.7, use collections.Counter for frequency tables.

from collections import Counter
from itertools import takewhile

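data = [1, 1, 2, 3, 4, 4]
freq = Counter(data)
# (value, count) pairs, sorted from most to least frequent
mostfreq = freq.most_common()
# keep the leading pairs whose count equals the highest count
modes = list(takewhile(lambda x_f: x_f[1] == mostfreq[0][1], mostfreq))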

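Note the use of an anonymous function (lambda) that checks whether a pair (_, f) has the same frequency as the most frequent element. Because most_common() sorts by descending count, takewhile stops at the first pair with a lower count, so modes ends up holding (value, count) pairs rather than bare values.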
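
Since the question mentions Python 2.4, where collections.Counter is not available, the same idea can be sketched with a plain dict. This is only an illustrative version: the function name `modes` and the `order` list are my own choices, and although it sticks to old-style constructs, I have not tested it on 2.4. It also shows what the original attempt was missing: it returned the highest count instead of the values that have that count.

```python
def modes(values):
    """Return every value tied for the highest frequency, in first-seen order."""
    frequencies = {}
    order = []  # first-seen order; plain dicts are unordered on old Pythons
    for value in values:
        if value not in frequencies:
            frequencies[value] = 0
            order.append(value)
        frequencies[value] += 1
    if not frequencies:
        return []  # no data, so no mode
    highest = max(frequencies.values())
    # keep the values (not the counts) whose count equals the maximum
    return [value for value in order if frequencies[value] == highest]


print(modes([1, 1, 2, 3, 4, 4]))  # [1, 4]
```

If only a single mode is wanted, taking the first element of the returned list is one option; which of several tied values to prefer is a design decision rather than something the data settles.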

Fred Foo answered Sep 21 '22 20:09


Note that starting in Python 3.8, the standard library includes the statistics.multimode function to return a list of the most frequently occurring values in the order they were first encountered:

from statistics import multimode

multimode([1, 1, 2, 3, 4, 4])
# [1, 4]
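
A few edge cases may be worth seeing next to that call. The sketch below assumes Python 3.8 or later and uses only functions from the statistics module; it contrasts multimode with the older mode function, which returns a single value.

```python
from statistics import StatisticsError, mode, multimode

print(multimode([1, 1, 2, 3, 4, 4]))  # [1, 4]
print(multimode([1, 1, 2]))           # [1]  (a single mode still comes back as a list)
print(multimode([]))                  # []   (empty input gives an empty list)

# mode() returns one value: since 3.8, the first mode encountered
print(mode([1, 1, 2, 3, 4, 4]))       # 1

# mode() raises on empty input, unlike multimode()
try:
    mode([])
except StatisticsError:
    print("mode() needs at least one data point")
```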

Xavier Guihot answered Sep 23 '22 20:09