
How to make argsort order equal values randomly?

Say you have a NumPy vector [0,3,1,1,1]. Running argsort on it gives [0,2,3,4,1], so the indices of the three equal 1s always come out in the same order. What I want is an efficient way to shuffle the indices of identical values. Any idea how to do that without a while loop that walks the sorted vector with two indices?

numpy.array([0,3,1,1,1]).argsort()
Hanan Shteingart asked Nov 25 '13 16:11



1 Answer

This is a bit of a hack, but if your array contains only integers you can add random values and argsort the result. np.random.rand returns values in [0, 1), which is smaller than the gap of at least 1 between distinct integers, so the order of non-identical elements is preserved while ties are broken at random.

>>> import numpy as np
>>> arr = np.array([0,3,1,1,1])
>>> np.argsort(arr + np.random.rand(*arr.shape))
array([0, 4, 3, 2, 1])
>>> np.argsort(arr + np.random.rand(*arr.shape))
array([0, 3, 4, 2, 1])
>>> np.argsort(arr + np.random.rand(*arr.shape))
array([0, 3, 4, 2, 1])
>>> np.argsort(arr + np.random.rand(*arr.shape))
array([0, 2, 3, 4, 1])
>>> np.argsort(arr + np.random.rand(*arr.shape))
array([0, 2, 3, 4, 1])
>>> np.argsort(arr + np.random.rand(*arr.shape))
array([0, 4, 2, 3, 1])

Here we see index 0 is always first in the argsort result and index 1 is last, but the rest of the results are in a random order.
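The same trick can be wrapped in a small helper. This is only a sketch: the function name random_tie_argsort_int and the rng parameter are made up for illustration, and it assumes a 1-D integer array whose values are small enough to be represented exactly as float64.

```python
import numpy as np

def random_tie_argsort_int(arr, rng=None):
    """Argsort an integer array, breaking ties in random order (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    arr = np.asarray(arr)
    if not np.issubdtype(arr.dtype, np.integer):
        raise TypeError("this trick relies on integer input")
    # Noise drawn from [0, 1) is smaller than the gap of at least 1 between
    # distinct integers, so it is only expected to reorder equal values.
    return np.argsort(arr + rng.random(arr.shape))

arr = np.array([0, 3, 1, 1, 1])
idx = random_tie_argsort_int(arr)
print(idx)       # indices 2, 3, 4 appear in a random order between 0 and 1
print(arr[idx])  # the values themselves are still sorted
```

Passing a seeded generator, for example np.random.default_rng(0), should make the tie order reproducible.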

In general you could scale the random values to stay below the smallest nonzero gap between sorted values, i.e. the minimum of the positive entries of np.diff(np.sort(arr)), but you might run into precision issues at some point.
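A different technique that avoids adding noise to the values at all, and so sidesteps the precision concern, is to use np.lexsort with a random secondary key. This is not the approach from the answer above, just a possible alternative; the helper name random_tie_argsort and the sample array vals are invented for the sketch, and it assumes a 1-D input.

```python
import numpy as np

def random_tie_argsort(arr, rng=None):
    """Argsort a 1-D array of any numeric dtype with random tie-breaking (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    arr = np.asarray(arr)
    # np.lexsort treats the LAST key as the primary one, so arr decides the
    # overall order and the random key only orders runs of equal elements.
    return np.lexsort((rng.random(arr.size), arr))

vals = np.array([0.5, 2.0, 0.5, 0.5000001, 2.0])
idx = random_tie_argsort(vals)
print(idx)        # indices 0 and 2 come first in random order, then 3, then 1 and 4
print(vals[idx])  # values are sorted; nearly equal values are never swapped
```

Because the original values are compared directly, distinct values keep their order no matter how close together they are.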

YXD answered Nov 09 '22 10:11
