Most efficient way to find mode in numpy array

I have a 2D array containing integers (both positive and negative). Each row represents the values over time for a particular spatial site, whereas each column represents values for various spatial sites for a given time.

So if the array is like:

1 3 4 2 2 7
5 2 2 1 4 1
3 3 2 2 1 1

The result (the mode of each column) should be:

1 3 2 2 2 1 

Note that when a column has more than one mode, any one of them (selected randomly) may be reported as the mode.

I can iterate over the columns, finding the mode one at a time, but I was hoping NumPy might have a built-in function for that, or that there is a trick to do it efficiently without looping.
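For reference, the column-by-column loop described above might look like the following sketch. The function name `column_modes_loop` is made up for illustration. Ties resolve to the smallest value here, because `np.unique` returns sorted values and `argmax` picks the first maximum.

```python
import numpy as np

def column_modes_loop(a):
    """Baseline sketch: find the mode of each column with an explicit loop."""
    a = np.asarray(a)
    modes = np.empty(a.shape[1], dtype=a.dtype)
    for j in range(a.shape[1]):
        # Unique values in column j (sorted) and how often each occurs.
        values, counts = np.unique(a[:, j], return_counts=True)
        # argmax returns the first maximum, so ties go to the smallest value.
        modes[j] = values[counts.argmax()]
    return modes

a = np.array([[1, 3, 4, 2, 2, 7],
              [5, 2, 2, 1, 4, 1],
              [3, 3, 2, 2, 1, 1]])
print(column_modes_loop(a))  # [1 3 2 2 1 1]
```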

Nik asked May 02 '13 05:05





1 Answer

Check scipy.stats.mode() (inspired by @tom10's comment):

import numpy as np
from scipy import stats

a = np.array([[1, 3, 4, 2, 2, 7],
              [5, 2, 2, 1, 4, 1],
              [3, 3, 2, 2, 1, 1]])

m = stats.mode(a)
print(m)

Output:

ModeResult(mode=array([[1, 3, 2, 2, 1, 1]]), count=array([[1, 2, 2, 2, 1, 2]])) 

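As you can see, it returns both the modes and their counts. By default the mode is taken along axis 0, that is, once per column, and on ties the smallest value is returned. The 2D shape shown here comes from an older SciPy release; as far as I know, newer releases (1.11 and later) default to keepdims=False and return 1D arrays instead. You can select the modes directly via m[0]: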

print(m[0]) 

Output:

[[1 3 2 2 1 1]] 
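If SciPy is not an option, a NumPy-only version without a Python loop over the columns is possible for integer data. The following is a minimal sketch, not part of the original answer: the function name `column_modes_bincount` is hypothetical, and it assumes a 2D integer array whose value range (max minus min) is small enough that a count table of shape (columns, range) fits in memory. It shifts the values to be non-negative, gives each column its own block of bins, and counts everything with a single `np.bincount` call.

```python
import numpy as np

def column_modes_bincount(a):
    """Sketch: per-column mode of a 2D integer array using one bincount call."""
    a = np.asarray(a)
    lo = a.min()
    shifted = a - lo                     # make all values non-negative
    n_vals = int(shifted.max()) + 1      # number of bins needed per column
    n_cols = a.shape[1]
    # Offset column j by j * n_vals so every column counts into its own bins.
    flat = shifted + np.arange(n_cols) * n_vals
    counts = np.bincount(flat.ravel(), minlength=n_cols * n_vals)
    counts = counts.reshape(n_cols, n_vals)
    # argmax returns the first maximum, so ties go to the smallest value.
    return counts.argmax(axis=1) + lo

a = np.array([[1, 3, 4, 2, 2, 7],
              [5, 2, 2, 1, 4, 1],
              [3, 3, 2, 2, 1, 1]])
print(column_modes_bincount(a))  # [1 3 2 2 1 1]
```

For this input the result matches the modes shown above. For data with a very wide value range the count table becomes large, and stats.mode is likely the better choice.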
fgb answered Oct 05 '22 12:10
