Make use of Python Counter which returns count of each element in the list. Thus, we simply find the most common element by using most_common() method.
Solution StepsmaxFreq = 0, mostFrequent = -1. Now we scan the sorted array using a loop till i < n. Inside the loop, we initialize a variable countFreq to track the frequency count i.e. countFreq = 1. We start from the first element (i =0) and search its consecutive occurrences using a loop till X[i] = X[i + 1].
Using bincount( ). argmax( ) function — We can get the most frequent element in numpy array using bincount function.
Write a Python program to get the most frequent element in a given list of numbers. Use set() to get the unique values in nums. Use max() to find the element that has the most appearances.
If your list contains all non-negative ints, you should take a look at numpy.bincounts:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.bincount.html
and then probably use np.argmax:
a = np.array([1,2,3,1,2,1,1,1,3,2,2,1])
counts = np.bincount(a)
print(np.argmax(counts))
For a more complicated list (that perhaps contains negative numbers or non-integer values), you can use np.histogram
in a similar way. Alternatively, if you just want to work in python without using numpy, collections.Counter
is a good way of handling this sort of data.
from collections import Counter
a = [1,2,3,1,2,1,1,1,3,2,2,1]
b = Counter(a)
print(b.most_common(1))
You may use
values, counts = np.unique(a, return_counts=True)
ind = np.argmax(counts)
print(values[ind]) # prints the most frequent element
ind = np.argpartition(-counts, kth=10)[:10]
print(values[ind]) # prints the 10 most frequent elements
If some element is as frequent as another one, this code will return only the first element.
If you're willing to use SciPy:
>>> from scipy.stats import mode
>>> mode([1,2,3,1,2,1,1,1,3,2,2,1])
(array([ 1.]), array([ 6.]))
>>> most_frequent = mode([1,2,3,1,2,1,1,1,3,2,2,1])[0][0]
>>> most_frequent
1.0
>>> # small array
>>> a = [12,3,65,33,12,3,123,888000]
>>>
>>> import collections
>>> collections.Counter(a).most_common()[0][0]
3
>>> %timeit collections.Counter(a).most_common()[0][0]
100000 loops, best of 3: 11.3 µs per loop
>>>
>>> import numpy
>>> numpy.bincount(a).argmax()
3
>>> %timeit numpy.bincount(a).argmax()
100 loops, best of 3: 2.84 ms per loop
>>>
>>> import scipy.stats
>>> scipy.stats.mode(a)[0][0]
3.0
>>> %timeit scipy.stats.mode(a)[0][0]
10000 loops, best of 3: 172 µs per loop
>>>
>>> from collections import defaultdict
>>> def jjc(l):
... d = defaultdict(int)
... for i in a:
... d[i] += 1
... return sorted(d.iteritems(), key=lambda x: x[1], reverse=True)[0]
...
>>> jjc(a)[0]
3
>>> %timeit jjc(a)[0]
100000 loops, best of 3: 5.58 µs per loop
>>>
>>> max(map(lambda val: (a.count(val), val), set(a)))[1]
12
>>> %timeit max(map(lambda val: (a.count(val), val), set(a)))[1]
100000 loops, best of 3: 4.11 µs per loop
>>>
Best is 'max' with 'set' for small arrays like the problem.
According to @David Sanders, if you increase the array size to something like 100,000 elements, the "max w/set" algorithm ends up being the worst by far whereas the "numpy bincount" method is the best.
Starting in Python 3.4
, the standard library includes the statistics.mode
function to return the single most common data point.
from statistics import mode
mode([1, 2, 3, 1, 2, 1, 1, 1, 3, 2, 2, 1])
# 1
If there are multiple modes with the same frequency, statistics.mode
returns the first one encountered.
Starting in Python 3.8
, the statistics.multimode
function returns a list of the most frequently occurring values in the order they were first encountered:
from statistics import multimode
multimode([1, 2, 3, 1, 2])
# [1, 2]
Also if you want to get most frequent value(positive or negative) without loading any modules you can use the following code:
lVals = [1,2,3,1,2,1,1,1,3,2,2,1]
print max(map(lambda val: (lVals.count(val), val), set(lVals)))
While most of the answers above are useful, in case you: 1) need it to support non-positive-integer values (e.g. floats or negative integers ;-)), and 2) aren't on Python 2.7 (which collections.Counter requires), and 3) prefer not to add the dependency of scipy (or even numpy) to your code, then a purely python 2.6 solution that is O(nlogn) (i.e., efficient) is just this:
from collections import defaultdict
a = [1,2,3,1,2,1,1,1,3,2,2,1]
d = defaultdict(int)
for i in a:
d[i] += 1
most_frequent = sorted(d.iteritems(), key=lambda x: x[1], reverse=True)[0]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With