
Why does np.percentile return NaN for high percentiles?

This code:

print(len(my_series))
print(np.percentile(my_series, 98))
print(np.percentile(my_series, 99))

gives:

14221  # This is the series length
1644.2  # 98th percentile
nan  # 99th percentile?

Why does 98 work fine but 99 give nan?

Thomas Johnson asked Jun 15 '15 00:06



1 Answer

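np.percentile treats NaNs as very high (effectively infinite) values, so they sort to the top of the data and the highest percentiles land on them. In your case, between 1 and 2 percent of your data must be NaN: the 98th percentile still returns a number (though not the true 98th percentile of the valid values, because the NaNs are counted in the total), while the 99th percentile falls in the NaN range and returns nan. Note that newer NumPy releases return nan for every percentile as soon as the input contains any NaN.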
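
One way to confirm this is to count the NaNs directly. The following is a minimal sketch, assuming my_series holds floats; the synthetic data is hypothetical and only mimics the length reported in the question, with NaNs placed in roughly 1.5% of the positions.

```python
import numpy as np

# Hypothetical stand-in for my_series: 14221 floats, 213 of them set to NaN.
rng = np.random.default_rng(0)
my_series = rng.normal(loc=1000.0, scale=300.0, size=14221)
nan_positions = rng.choice(my_series.size, size=213, replace=False)
my_series[nan_positions] = np.nan

# np.isnan gives a boolean mask; summing it counts the NaN entries.
nan_count = int(np.isnan(my_series).sum())
nan_fraction = nan_count / my_series.size
print(nan_count, nan_fraction)
```

If the fraction is between 0.01 and 0.02, that matches the pattern described above: the 98th percentile still lands on a real value, the 99th does not.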

To compute percentiles over only the valid values, use np.nanpercentile(), which ignores NaNs:


print(np.nanpercentile(my_series, 98))
print(np.nanpercentile(my_series, 99))
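
To see the difference on a small case, here is a sketch with a hypothetical five-element array (not the asker's data). It compares np.percentile, np.nanpercentile, and the equivalent approach of masking out the NaNs before calling np.percentile.

```python
import numpy as np

# Hypothetical sample: four valid values and one NaN.
data = np.array([1.0, 2.0, 3.0, 4.0, np.nan])

# A high percentile of the raw array runs into the NaN.
with_nan = np.percentile(data, 99)

# nanpercentile ignores the NaN and works on [1, 2, 3, 4] only.
ignoring_nan = np.nanpercentile(data, 50)

# Equivalent: drop the NaNs yourself, then use the plain function.
dropped_first = np.percentile(data[~np.isnan(data)], 50)

print(with_nan, ignoring_nan, dropped_first)
```

The masking variant can be handy when the same cleaned array is reused for several other statistics.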
Niels Henkens answered Sep 21 '22 19:09