I have a numpy array:
foo = array([3, 1, 4, 0, 1, 0])
I want the top 3 items. Calling
foo.argsort()[::-1][:3]
returns
array([2, 0, 4])
Notice values foo[1]
and foo[4]
are equal, so numpy.argsort()
handles the tie by returning the index of the item which appears last in the array; i.e. index 4.
For my application I can't have the tie breaking always bias the end of the array, so how can I implement a random tie break? That is, half the time I would get array([2, 0, 4])
, and the other half I would get array([2, 0, 1])
.
Here's one approach:
Use numpy.unique
to both sort the array and remove duplicate items. Pass the return_inverse
argument to get the indices into the sorted array that give the values of the original array. Then, you can get all of the indices of the tied items by finding the indices of the inverse array whose values are equal to the index into the unique array for that item.
For example:
foo = array([3, 1, 4, 0, 1, 0])
foo_unique, foo_inverse = unique(foo, return_inverse=True)
# Put largest items first
foo_unique = foo_unique[::-1]
foo_inverse = -foo_inverse + len(foo_unique) - 1
foo_top3 = foo_unique[:3]
# Get the indices into foo of the top item
first_indices = (foo_inverse == 0).nonzero()
# Choose one at random
first_random_idx = random.choice(first_indices)
second_indices = (foo_inverse == 1).nonzero()
second_random_idx = random.choice(second_indices)
# And so on...
numpy.unique
is implemented using argsort
, so a glance at its implementation might suggest a simpler approach.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With