
How do I get the index of a specific percentile in numpy / scipy?

I have looked at this answer, which explains how to compute the value of a specific percentile, and this answer, which explains how to compute the percentiles that correspond to each element.

  • Using the first solution, I can compute the value and scan the original array to find the index.

  • Using the second solution, I can scan the entire output array for the percentile I'm looking for.

However, both require an additional scan if I want to know the index (in the original array) that corresponds to a particular percentile (or the index of the element closest to that percentile).
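For reference, a minimal sketch of the first two-step approach described above, assuming a 1-D numeric array: compute the percentile value with np.percentile, then scan for the element closest to it. The helper name index_by_scan is made up for illustration.

```python
import numpy as np

def index_by_scan(a, percent):
    """Two-pass baseline: compute the percentile value, then scan for the closest element."""
    a = np.asarray(a)
    value = np.percentile(a, percent)         # pass 1: percentile value (may be interpolated)
    return int(np.abs(a - value).argmin())    # pass 2: index of the element closest to it

a = np.array([5, 6, 4, 9, 2, 1, 3, 0, 7, 8])
i = index_by_scan(a, 25)
print(i, a[i])  # 4 2
```

The second line of the function is the extra scan the question is trying to avoid.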

Is there a more direct or built-in way to get the index which corresponds to a percentile?

Note: My array is not sorted and I want the index in the original, unsorted array.

merlin2011 asked Sep 27 '14 01:09



1 Answer

It is a little convoluted, but you can get what you are after with np.argpartition. Let's take an easy array and shuffle it:

>>> import numpy as np
>>> a = np.arange(10)
>>> np.random.shuffle(a)
>>> a
array([5, 6, 4, 9, 2, 1, 3, 0, 7, 8])

If you want to find e.g. the index of quantile 0.25, this would correspond to the item in position idx of the sorted array:

>>> idx = 0.25 * (len(a) - 1)
>>> idx
2.25

You need to decide how to round that to an int. Say you go with the nearest integer:

>>> idx = int(idx + 0.5)
>>> idx
2
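Nearest-integer rounding is only one option. As a small sketch (the variable names below are illustrative, not part of the answer), the same fractional position can be rounded down, up, or to the nearest rank, and each choice selects a different element of the sorted array:

```python
import math

n = 10                      # length of the example array
pos = 0.25 * (n - 1)        # fractional position in the sorted array: 2.25

lower = math.floor(pos)     # always take the element at or below the quantile
upper = math.ceil(pos)      # always take the element at or above the quantile
nearest = int(pos + 0.5)    # round half up, as in the answer

print(lower, upper, nearest)  # 2 3 2
```

Any of these integers can be passed as the second argument to np.argpartition.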

If you now call np.argpartition, this is what you get:

>>> np.argpartition(a, idx)
array([7, 5, 4, 3, 2, 1, 6, 0, 8, 9], dtype=int64)
>>> np.argpartition(a, idx)[idx]
4
>>> a[np.argpartition(a, idx)[idx]]
2

It is easy to check that these last two expressions are, respectively, the index and the value of the .25 quantile.
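The steps above can be collected into a small helper. This is a sketch under a few assumptions: the input is 1-D, the quantile q is given in the range 0 to 1, and nearest-rank rounding is used as in the session above. The name quantile_index is made up for illustration.

```python
import numpy as np

def quantile_index(a, q):
    """Index in the original array of the element at quantile q (0 <= q <= 1)."""
    a = np.asarray(a)
    k = int(q * (a.size - 1) + 0.5)        # position in sorted order, rounded to nearest
    return int(np.argpartition(a, k)[k])   # original index of the k-th smallest element

a = np.array([5, 6, 4, 9, 2, 1, 3, 0, 7, 8])
i = quantile_index(a, 0.25)
print(i, a[i])  # 4 2
```

np.argpartition only guarantees that position k holds the index of the k-th smallest element, so it avoids a full sort. If the array contains repeated values, which of the tied indices is returned is not specified.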

Jaime answered Sep 28 '22 05:09
