Find the most frequent number in a NumPy array

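Q: How do you find the most frequent strings in a NumPy array?

Using bincount().argmax(): np.bincount counts the occurrences of each value in an array of non-negative integers, and argmax() returns the value with the highest count. bincount does not accept strings; for a string array, np.unique with return_counts=True gives the counts instead.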
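Since the question asks about strings, here is a minimal sketch of the np.unique route. The sample array `words` is made up for illustration; any 1-D array of strings should behave the same way.

```python
import numpy as np

# Hypothetical sample data, not from the original question.
words = np.array(["pear", "apple", "fig", "apple", "pear", "apple"])

# np.unique returns the sorted distinct values; return_counts=True adds
# how many times each of them occurs.
values, counts = np.unique(words, return_counts=True)

# argmax picks the position of the largest count.
most_frequent = values[np.argmax(counts)]
print(most_frequent)  # apple
```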

Q: How do you print the most frequent elements in an array in Python?

To get the most frequent element in a list of numbers, use set() to get the unique values in nums, then use max() with nums.count as the key to find the value that appears most often.
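A short sketch of the set() plus max() idea described above. The function name `most_frequent` is just an illustrative choice, and the list is the one used elsewhere on this page.

```python
def most_frequent(nums):
    # set(nums) yields each distinct value once; nums.count is then
    # called once per distinct value, so this is O(n * k) for k distinct values.
    return max(set(nums), key=nums.count)


nums = [1, 2, 3, 1, 2, 1, 1, 1, 3, 2, 2, 1]
print(most_frequent(nums))  # 1
```

This is fine for short lists, but the repeated count() calls make it slow when there are many distinct values.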

Q: How do I find the most common number in an array in Python?

Use collections.Counter, which returns the count of each element in the list. Its most_common() method then gives the most common element.

Q: How do you find the most frequent element in an array?

Solution steps: sort the array, then initialize maxFreq = 0 and mostFrequent = -1. Scan the sorted array with a loop while i < n. Inside the loop, initialize countFreq = 1 to track the frequency of the current element. Starting from the first element (i = 0), count its consecutive occurrences with an inner loop that runs while X[i] == X[i + 1]. Whenever countFreq exceeds maxFreq, update maxFreq and mostFrequent.
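The sort-and-scan steps above could be sketched in Python roughly as follows. The function name is hypothetical, and this version starts from None rather than -1 so that -1 can itself be a valid answer.

```python
def most_frequent_sorted(values):
    """Return the most frequent item by sorting, then scanning runs of equal items."""
    if not values:
        raise ValueError("values must not be empty")
    x = sorted(values)
    n = len(x)
    max_freq = 0
    most_frequent = None
    i = 0
    while i < n:
        count_freq = 1
        # Advance while the next element equals the current one.
        while i + 1 < n and x[i] == x[i + 1]:
            count_freq += 1
            i += 1
        # Strict comparison: on a tie, the smaller value (seen first) is kept.
        if count_freq > max_freq:
            max_freq = count_freq
            most_frequent = x[i]
        i += 1
    return most_frequent


print(most_frequent_sorted([1, 2, 3, 1, 2, 1, 1, 1, 3, 2, 2, 1]))  # 1
```

The sort dominates the cost, so the whole thing runs in O(n log n).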

If your list contains only non-negative ints, you should take a look at numpy.bincount:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.bincount.html

and then probably use np.argmax:

import numpy as np

a = np.array([1,2,3,1,2,1,1,1,3,2,2,1])
counts = np.bincount(a)  # counts[v] is the number of times v occurs in a
print(np.argmax(counts))  # prints 1

For a more complicated list (one that perhaps contains negative numbers or non-integer values), you can use np.histogram in a similar way. Alternatively, if you just want to work in plain Python without NumPy, collections.Counter is a good way of handling this sort of data.
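For negative integers specifically, one workaround (not the np.histogram approach mentioned above, but a simple alternative) is to shift the values so the minimum becomes zero before calling np.bincount. A minimal sketch with made-up data:

```python
import numpy as np

# Hypothetical sample containing negative integers.
a = np.array([-2, 5, -2, 0, 5, -2, 3])

# Shift so the smallest value maps to index 0, which bincount can handle.
offset = a.min()
counts = np.bincount(a - offset)

# Undo the shift to recover the original value.
most_frequent = np.argmax(counts) + offset
print(most_frequent)  # -2
```

Note that bincount allocates one counter per integer between the minimum and the maximum, so this only makes sense when that range is reasonably small. It does not help with floats.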

from collections import Counter
a = [1,2,3,1,2,1,1,1,3,2,2,1]
b = Counter(a)
print(b.most_common(1))

You can also use np.unique with return_counts=True:

values, counts = np.unique(a, return_counts=True)

ind = np.argmax(counts)
print(values[ind])  # prints the most frequent element

ind = np.argpartition(-counts, kth=10)[:10]  # needs more than 10 distinct values
print(values[ind])  # prints the 10 most frequent elements, in no particular order

If several elements are tied for most frequent, np.argmax returns only the first of them, which is the smallest value because np.unique returns its values sorted.
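If you want every tied value rather than just the first, one option is to compare the counts against their maximum. A small sketch, using a made-up array with a tie:

```python
import numpy as np

# Hypothetical sample in which 1 and 2 both occur twice.
a = np.array([1, 2, 3, 1, 2])

values, counts = np.unique(a, return_counts=True)

# Boolean mask selecting every value whose count equals the highest count.
modes = values[counts == counts.max()]
print(modes)  # [1 2]
```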

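If you're willing to use SciPy (the output below is from an older SciPy release; as far as I know, recent releases return a ModeResult with a scalar mode and count for 1-D input, so you would read the .mode attribute instead of indexing with [0][0]):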

>>> from scipy.stats import mode
>>> mode([1,2,3,1,2,1,1,1,3,2,2,1])
(array([ 1.]), array([ 6.]))
>>> most_frequent = mode([1,2,3,1,2,1,1,1,3,2,2,1])[0][0]
>>> most_frequent
1.0

Performance (using IPython) of some of the solutions found here:

>>> # small array
>>> a = [12,3,65,33,12,3,123,888000]
>>> 
>>> import collections
>>> collections.Counter(a).most_common()[0][0]
3
>>> %timeit collections.Counter(a).most_common()[0][0]
100000 loops, best of 3: 11.3 µs per loop
>>> 
>>> import numpy
>>> numpy.bincount(a).argmax()
3
>>> %timeit numpy.bincount(a).argmax()
100 loops, best of 3: 2.84 ms per loop
>>> 
>>> import scipy.stats
>>> scipy.stats.mode(a)[0][0]
3.0
>>> %timeit scipy.stats.mode(a)[0][0]
10000 loops, best of 3: 172 µs per loop
>>> 
>>> from collections import defaultdict
>>> def jjc(l):
...     d = defaultdict(int)
...     for i in l:
...         d[i] += 1
...     return sorted(d.items(), key=lambda x: x[1], reverse=True)[0]
... 
>>> jjc(a)[0]
3
>>> %timeit jjc(a)[0]
100000 loops, best of 3: 5.58 µs per loop
>>> 
>>> max(map(lambda val: (a.count(val), val), set(a)))[1]
12
>>> %timeit max(map(lambda val: (a.count(val), val), set(a)))[1]
100000 loops, best of 3: 4.11 µs per loop
>>>

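For small arrays like the one in the question, 'max' with 'set' is the fastest. Note that numpy.bincount is slow here only because the array contains 888000, which forces it to allocate 888001 counters.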

According to @David Sanders, if you increase the array size to something like 100,000 elements, the "max w/set" algorithm ends up being the worst by far whereas the "numpy bincount" method is the best.
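To check that claim on your own machine, a rough timing sketch along these lines should do. The data, the function names, and the choice of 200 distinct values are all assumptions made for illustration, and the timings will vary from machine to machine.

```python
import timeit
from collections import Counter

import numpy as np

# Hypothetical test data: 100,000 integers drawn from 200 distinct values.
rng = np.random.default_rng(0)
arr = rng.integers(0, 200, size=100_000)
lst = arr.tolist()


def with_counter():
    return Counter(lst).most_common(1)[0][0]


def with_bincount():
    return int(np.bincount(arr).argmax())


def with_max_set():
    # Calls lst.count once per distinct value, so it scans the list 200 times.
    return max(set(lst), key=lst.count)


for fn in (with_counter, with_bincount, with_max_set):
    seconds = timeit.timeit(fn, number=1)
    print(f"{fn.__name__}: {seconds:.4f} s")
```

If two values are tied for most frequent, the three functions may return different ones, since each breaks ties in its own way.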

Starting in Python 3.4, the standard library includes the statistics.mode function to return the single most common data point.

from statistics import mode

mode([1, 2, 3, 1, 2, 1, 1, 1, 3, 2, 2, 1])
# 1

If there are multiple modes with the same frequency, statistics.mode returns the first one encountered (as of Python 3.8; earlier versions raise StatisticsError instead).

Starting in Python 3.8, the statistics.multimode function returns a list of the most frequently occurring values in the order they were first encountered:

from statistics import multimode

multimode([1, 2, 3, 1, 2])
# [1, 2]

If you want the most frequent value (positive or negative) without importing any modules, you can use the following code:

lVals = [1,2,3,1,2,1,1,1,3,2,2,1]
print(max(map(lambda val: (lVals.count(val), val), set(lVals))))  # prints (6, 1): the count, then the value

While most of the answers above are useful, if you (1) need to support values other than non-negative integers (e.g. floats or negative integers), (2) aren't on Python 2.7 (which collections.Counter requires), and (3) prefer not to add a SciPy (or even NumPy) dependency to your code, then a pure Python 2.6 solution that runs in O(n log n) is just this:

from collections import defaultdict

a = [1,2,3,1,2,1,1,1,3,2,2,1]

d = defaultdict(int)
for i in a:
  d[i] += 1
# items() works on both Python 2 and 3; the result is a (value, count) pair
most_frequent = sorted(d.items(), key=lambda x: x[1], reverse=True)[0]
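The full sort is not strictly needed just to find the top entry. A variant of the same defaultdict idea that takes the maximum directly, in a single linear pass over the counts, might look like this (a sketch, written for Python 3):

```python
from collections import defaultdict

a = [1, 2, 3, 1, 2, 1, 1, 1, 3, 2, 2, 1]

# Count occurrences exactly as in the snippet above.
d = defaultdict(int)
for i in a:
    d[i] += 1

# max over the keys, ranked by their counts: O(n) overall, no sort.
most_frequent = max(d, key=d.get)
print(most_frequent, d[most_frequent])  # 1 6
```

Because it relies only on hashing and comparing counts, this also handles floats and negative numbers.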
