Is there another way in numpy to realize scipy.stats.mode function to get the most frequent values in ndarrays along axis?(without importing other modules) i.e.
import numpy as np
from scipy.stats import mode
a = np.array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[40, 40, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
mode= mode(data, axis=0)
mode = mode[0]
print mode
>>>[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]
If you know there are not many different values (relative to the size of the input "itemArray"), something like this could be efficient:
uniqueValues = np.unique(itemArray).tolist()
uniqueCounts = [len(np.nonzero(itemArray == uv)[0])
for uv in uniqueValues]
modeIdx = uniqueCounts.index(max(uniqueCounts))
mode = itemArray[modeIdx]
# All counts as a map
valueToCountMap = dict(zip(uniqueValues, uniqueCounts))
The scipy.stats.mode
function is defined with this code, which only relies on numpy:
def mode(a, axis=0):
scores = np.unique(np.ravel(a)) # get ALL unique values
testshape = list(a.shape)
testshape[axis] = 1
oldmostfreq = np.zeros(testshape)
oldcounts = np.zeros(testshape)
for score in scores:
template = (a == score)
counts = np.expand_dims(np.sum(template, axis),axis)
mostfrequent = np.where(counts > oldcounts, score, oldmostfreq)
oldcounts = np.maximum(counts, oldcounts)
oldmostfreq = mostfrequent
return mostfrequent, oldcounts
Source: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L609
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With