Alternative to Scipy mode function in Numpy?

Is there a way to reproduce the scipy.stats.mode function in pure numpy, i.e. to get the most frequent values of an ndarray along an axis without importing any other module? For example:

import numpy as np
from scipy.stats import mode

data = np.array([[[ 0,  1,  2,  3,  4],
                  [ 5,  6,  7,  8,  9],
                  [10, 11, 12, 13, 14],
                  [15, 16, 17, 18, 19]],

                 [[ 0,  1,  2,  3,  4],
                  [ 5,  6,  7,  8,  9],
                  [10, 11, 12, 13, 14],
                  [15, 16, 17, 18, 19]],

                 [[40, 40, 42, 43, 44],
                  [45, 46, 47, 48, 49],
                  [50, 51, 52, 53, 54],
                  [55, 56, 57, 58, 59]]])

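result = mode(data, axis=0)   # use a new name so the imported mode() is not overwritten
most_frequent = result[0]     # first element holds the modal values
print(most_frequent)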
>>>[ 0,  1,  2,  3,  4],
   [ 5,  6,  7,  8,  9],
   [10, 11, 12, 13, 14],
   [15, 16, 17, 18, 19]
oops asked Sep 13 '12 03:09


2 Answers

If you know the input contains few distinct values relative to its size, something like this can be efficient (here "itemArray" is the input array):

uniqueValues = np.unique(itemArray).tolist()
uniqueCounts = [len(np.nonzero(itemArray == uv)[0])
                for uv in uniqueValues]

# modeIdx is a position in uniqueValues, so look the mode up there (not in itemArray)
modeIdx = uniqueCounts.index(max(uniqueCounts))
mode = uniqueValues[modeIdx]

# All counts as a map
valueToCountMap = dict(zip(uniqueValues, uniqueCounts))
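
As a variation on the snippet above, np.unique can return the counts itself through its return_counts argument (NumPy 1.9 or later), which avoids the Python-level loop over unique values. The following is only a sketch: the helper name flat_mode and the sample arrays are made up for illustration, and np.apply_along_axis is used as one simple (not necessarily fast) way to extend the idea to a chosen axis.

```python
import numpy as np

def flat_mode(values):
    """Return (mode, count) of a 1-D sequence; ties go to the smallest value."""
    uniques, counts = np.unique(values, return_counts=True)
    best = np.argmax(counts)  # first maximum; uniques is sorted ascending
    return uniques[best], counts[best]

# Hypothetical sample data
item_array = np.array([3, 1, 3, 2, 1, 3])
value, count = flat_mode(item_array)
print(value, count)  # 3 3

# Mode of each column: apply the 1-D helper along axis 0
stack = np.array([[1, 2],
                  [1, 5],
                  [4, 5]])
modes_along_axis0 = np.apply_along_axis(lambda col: flat_mode(col)[0], 0, stack)
print(modes_along_axis0)  # [1 5]
```

Note that np.apply_along_axis calls the helper once per 1-D slice, so it may be slow on arrays with many slices.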
cwa answered Oct 21 '22 01:10


At the time of this answer, the scipy.stats.mode function was defined with the following code, which relies only on numpy:

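def mode(a, axis=0):
    scores = np.unique(np.ravel(a))       # get ALL unique values (sorted ascending)
    testshape = list(a.shape)
    testshape[axis] = 1                   # result keeps `axis` with length 1
    oldmostfreq = np.zeros(testshape)     # float arrays, so results come back as floats
    oldcounts = np.zeros(testshape)

    for score in scores:
        template = (a == score)
        # occurrences of `score` along `axis`, kept broadcastable against oldcounts
        counts = np.expand_dims(np.sum(template, axis),axis)
        # strict ">" keeps the earlier (smaller) value when counts tie
        mostfrequent = np.where(counts > oldcounts, score, oldmostfreq)
        oldcounts = np.maximum(counts, oldcounts)
        oldmostfreq = mostfrequent

    return mostfrequent, oldcounts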

Source: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L609
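
To check the function above against the array from the question, a self-contained sketch might look like this. The name numpy_mode and the way the array is built are choices made here for illustration; the logic is the same loop over unique values, and the results come back as floats with the reduced axis kept at length 1.

```python
import numpy as np

def numpy_mode(a, axis=0):
    """Pure-NumPy mode along `axis`; ties go to the smallest value."""
    a = np.asarray(a)
    shape = list(a.shape)
    shape[axis] = 1
    best_values = np.zeros(shape)
    best_counts = np.zeros(shape)
    for score in np.unique(a):
        # How often `score` occurs along `axis`, kept broadcastable
        counts = np.expand_dims(np.sum(a == score, axis), axis)
        best_values = np.where(counts > best_counts, score, best_values)
        best_counts = np.maximum(counts, best_counts)
    return best_values, best_counts

# Rebuild the question's array: two identical layers plus a shifted one
base = np.arange(20).reshape(4, 5)
third = base + 40
third[0, 1] = 40  # the question's third layer starts 40, 40, 42, ...
data = np.array([base, base, third])

values, counts = numpy_mode(data, axis=0)
print(values.shape)           # (1, 4, 5)
print(values[0].astype(int))  # same entries as `base`
print(counts[0].astype(int))  # every entry is 2
```

Each position along axis 0 holds the same value twice (from the two identical layers), so the mode is the first layer and every count is 2.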

Blender answered Oct 21 '22 01:10