This code:
import numpy as np
print(len(my_series))
print(np.percentile(my_series, 98))
print(np.percentile(my_series, 99))
gives:
14221 # This is the series length
1644.2 # 98th percentile
nan # 99th percentile?
Why does 98 work fine while 99 gives nan?
np.percentile sorts the data, and NaNs sort to the end, so they are effectively treated as very high/infinite numbers. The highest percentiles therefore fall in the range occupied by the NaNs. In your case, between 1 and 2 percent of your data must be NaN: the 98th percentile returns a number (which is not actually the 98th percentile of the valid values, because the NaNs are still counted), and the 99th returns nan. (Newer NumPy versions are stricter and return nan for any percentile once the input contains a NaN.)
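One way to check this explanation is to count the missing values. This is a minimal sketch, assuming the data is a float array; `data` below is made-up stand-in data, not the original my_series:

```python
import numpy as np

# Hypothetical stand-in for my_series: 1000 values, the last 15 of them NaN.
data = np.arange(1000, dtype=float)
data[-15:] = np.nan

nan_count = int(np.isnan(data).sum())      # number of NaN entries
nan_share = 100.0 * nan_count / data.size  # NaN entries as a percentage of all entries

print(nan_count)  # 15
print(nan_share)  # 1.5
```

A share between 1 and 2 percent on the real series would match the behaviour described above.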
To calculate the percentiles while ignoring the NaNs, you can use np.nanpercentile():
print(np.nanpercentile(my_series, 98))
print(np.nanpercentile(my_series, 99))
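As a self-contained sketch of the fix (again with made-up data rather than the original series), np.nanpercentile should agree with dropping the NaNs by hand and calling np.percentile on what remains:

```python
import numpy as np

# Hypothetical stand-in for my_series: 1000 values, the last 15 of them NaN.
data = np.arange(1000, dtype=float)
data[-15:] = np.nan

# Keep only the valid (non-NaN) entries.
valid = data[~np.isnan(data)]

p99_plain = np.percentile(data, 99)         # NaNs still in the input
p99_nan_aware = np.nanpercentile(data, 99)  # NaNs ignored
p99_manual = np.percentile(valid, 99)       # NaNs removed by hand

print(np.isnan(p99_plain))                    # True
print(np.isclose(p99_nan_aware, p99_manual))  # True
```

Filtering by hand is handy when the cleaned array is needed for other calculations too; otherwise np.nanpercentile is the shorter route.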