
Weighted percentile using numpy

Is there a way to use the numpy.percentile function to compute a weighted percentile? Or is anyone aware of an alternative Python function that computes a weighted percentile?

thanks!

user308827 asked Feb 18 '14 03:02




1 Answer

Completely vectorized numpy solution

Here is the code I use. It is not optimal (I was unable to write an optimal version with numpy), but it is still much faster and more reliable than the accepted solution.

import numpy as np


def weighted_quantile(values, quantiles, sample_weight=None,
                      values_sorted=False, old_style=False):
    """ Very close to numpy.percentile, but supports weights.
    NOTE: quantiles should be in [0, 1]!
    :param values: numpy.array with data
    :param quantiles: array-like with many quantiles needed
    :param sample_weight: array-like of the same length as `values`
    :param values_sorted: bool, if True, then will avoid sorting of
        initial array
    :param old_style: if True, will correct output to be consistent
        with numpy.percentile.
    :return: numpy.array with computed quantiles.
    """
    values = np.array(values)
    quantiles = np.array(quantiles)
    if sample_weight is None:
        sample_weight = np.ones(len(values))
    sample_weight = np.array(sample_weight)
    assert np.all(quantiles >= 0) and np.all(quantiles <= 1), \
        'quantiles should be in [0, 1]'

    if not values_sorted:
        # Sort values and carry the weights along in the same order
        sorter = np.argsort(values)
        values = values[sorter]
        sample_weight = sample_weight[sorter]

    # Each point sits at the midpoint of its own weight interval
    weighted_quantiles = np.cumsum(sample_weight) - 0.5 * sample_weight
    if old_style:
        # To be consistent with numpy.percentile
        weighted_quantiles -= weighted_quantiles[0]
        weighted_quantiles /= weighted_quantiles[-1]
    else:
        weighted_quantiles /= np.sum(sample_weight)
    return np.interp(quantiles, weighted_quantiles, values)
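As a rough check of what the old_style branch does, the sketch below repeats its rescaling step by hand for equal weights and compares the result with numpy.percentile (default linear interpolation). The sample data and the names v, w, pos, q, ours and ref are only illustrative; this is a sanity check under those assumptions, not part of the function itself.

```python
import numpy as np

# Illustrative data, sorted, with equal weights (assumption for this check)
v = np.sort(np.array([1, 2, 9, 3.2, 4]))
w = np.ones(len(v))

# Same steps as the old_style branch: midpoint positions rescaled to [0, 1]
pos = np.cumsum(w) - 0.5 * w
pos -= pos[0]
pos /= pos[-1]

q = np.array([0.0, 0.25, 0.5, 0.9, 1.0])
ours = np.interp(q, pos, v)
ref = np.percentile(v, q * 100)  # numpy.percentile expects values in [0, 100]

print(np.allclose(ours, ref))
```

With equal weights the rescaled positions become 0, 1/(n-1), ..., 1, which is the same grid that numpy.percentile interpolates on by default.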

Examples:

weighted_quantile([1, 2, 9, 3.2, 4], [0.0, 0.5, 1.])

array([ 1. , 3.2, 9. ])

weighted_quantile([1, 2, 9, 3.2, 4], [0.0, 0.5, 1.], sample_weight=[2, 1, 2, 4, 1])

array([ 1. , 3.2, 9. ])
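To see where the weighted result comes from, the second example can be traced step by step outside the function. This is only a sketch of the default (old_style=False) path; the names order, v, w, positions and result are illustrative.

```python
import numpy as np

values = np.array([1, 2, 9, 3.2, 4])
weights = np.array([2, 1, 2, 4, 1])

# Sort the values and reorder the weights to match
order = np.argsort(values)
v = values[order]    # [1, 2, 3.2, 4, 9]
w = weights[order]   # [2, 1, 4, 1, 2]

# Midpoint of each point's weight interval, as a fraction of the total weight
positions = (np.cumsum(w) - 0.5 * w) / np.sum(w)  # [0.1, 0.25, 0.5, 0.75, 0.9]

# Linear interpolation; quantiles outside [0.1, 0.9] clamp to the end values
result = np.interp([0.0, 0.5, 1.0], positions, v)
print(result)
```

The value 3.2 carries weight 4 out of 10, and its midpoint position lands exactly at 0.5, so it is returned as the weighted median.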

Alleo answered Oct 03 '22 07:10