
I need the indices of the N minimum values in a numpy array

Hi, I have an array with X values in it, and I would like to locate the indices of the ten smallest values. The linked question, "How to get indices of N maximum values in a numpy array?", shows how to find the maximum values efficiently, but I can't comment there yet, so I'm posting this as a new question.

I'm not sure which indices I need to change to get the minimum values instead of the maximum. This is their code:

In [1]: import numpy as np

In [2]: arr = np.array([1, 3, 2, 4, 5])

In [3]: arr.argsort()[-3:][::-1]
Out[3]: array([4, 3, 1]) 
astrochris asked May 29 '13 15:05




3 Answers

If you call

arr.argsort()[:3]

it will give you the indices of the 3 smallest elements, ordered from smallest to largest:

array([0, 2, 1], dtype=int64)

So, for n, you should call

arr.argsort()[:n]
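As a minimal sketch of the same idea, the slice can be wrapped in a small helper. The name `n_smallest_indices` is hypothetical, and `kind="stable"` is an added choice (not part of the answer above) so that tied values keep their original order.

```python
import numpy as np

def n_smallest_indices(arr, n):
    """Return the indices of the n smallest values, smallest first (hypothetical helper)."""
    arr = np.asarray(arr)
    # Assumption: a stable sort is used so equal values keep their original order.
    return np.argsort(arr, kind="stable")[:n]

arr = np.array([1, 3, 2, 4, 5])
idx = n_smallest_indices(arr, 3)
print(idx)       # [0 2 1]
print(arr[idx])  # [1 2 3]
```

Because this does a full sort, it costs O(N log N) regardless of how small n is.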
petrichor answered Oct 18 '22 19:10


Since this question was posted, NumPy has added a faster way to select the smallest elements of an array: argpartition. It was first included in NumPy 1.8.

Using snarly's answer as inspiration, we can quickly find the k=3 smallest elements:

In [1]: import numpy as np

In [2]: arr = np.array([1, 3, 2, 4, 5])

In [3]: k = 3

In [4]: ind = np.argpartition(arr, k)[:k]

In [5]: ind
Out[5]: array([0, 2, 1])

In [6]: arr[ind]
Out[6]: array([1, 2, 3])

This runs in O(n) time because it does not need to do a full sort. The returned indices are not guaranteed to be in sorted order (in this case the output happened to be sorted, but that is a coincidence). If you need the values sorted, you can sort the output:

In [7]: np.sort(arr[ind])
Out[7]: array([1, 2, 3])

This runs in O(n + k log k) time because the sort only operates on the smaller output array.
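If it is the indices, rather than the values, that need to be in sorted order, one possible sketch is to partition first and then sort only the k selected indices by their values. The helper name `k_smallest_sorted` is hypothetical, it assumes a 1-D array, and the fallback to a full `argsort` when k is not smaller than the array size is an added assumption.

```python
import numpy as np

def k_smallest_sorted(arr, k):
    """Indices of the k smallest values of a 1-D array, smallest first (hypothetical helper)."""
    arr = np.asarray(arr)
    if k >= arr.size:
        # Assumption: argpartition needs kth < size, so fall back to a full sort.
        return np.argsort(arr)
    # O(n): the first k entries index the k smallest values, in arbitrary order.
    part = np.argpartition(arr, k)[:k]
    # O(k log k): order just those k indices by the values they point to.
    return part[np.argsort(arr[part])]

arr = np.array([9, 4, 7, 1, 8, 2])
idx = k_smallest_sorted(arr, 3)
print(idx)       # [3 5 1]
print(arr[idx])  # [1 2 4]
```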

Alex answered Oct 18 '22 19:10


I don't guarantee that this will be faster, but a better algorithm would rely on heapq.

import heapq
# Iterate over the index positions (not the values) and rank each index by its value.
indices = heapq.nsmallest(10, range(len(arr)), key=arr.__getitem__)

This should take roughly O(N log k) operations to find the k smallest of N elements, which is close to O(N) when k is small, whereas argsort takes O(N log N) operations. However, argsort is implemented in highly optimized C, so it might still perform better in practice. To know for sure, you'd need to run some tests on your actual data.
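A rough way to run such a test is sketched below. The array size, seed, and function names are illustrative assumptions, the heapq variant iterates over index positions with `range(len(arr))`, and the timings will vary by machine and NumPy version, so treat the output as indicative only.

```python
import heapq
import timeit

import numpy as np

rng = np.random.default_rng(0)   # fixed seed so runs are repeatable
arr = rng.permutation(100_000)   # distinct values, so ties cannot change the result
k = 10

def with_argsort():
    # Full sort, then keep the first k indices.
    return arr.argsort()[:k]

def with_argpartition():
    # Partial selection, then sort only the k selected indices.
    part = np.argpartition(arr, k)[:k]
    return part[np.argsort(arr[part])]

def with_heapq():
    # Pure-Python heap over index positions, ranked by the value at each index.
    return np.array(heapq.nsmallest(k, range(len(arr)), key=arr.__getitem__))

# All three approaches should agree on the indices, smallest value first.
assert np.array_equal(with_argsort(), with_argpartition())
assert np.array_equal(with_argsort(), with_heapq())

for fn in (with_argsort, with_argpartition, with_heapq):
    seconds = timeit.timeit(fn, number=5) / 5
    print(f"{fn.__name__:>18}: {seconds * 1e3:.3f} ms per call")
```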

mgilson answered Oct 18 '22 18:10