Is there a baked-in NumPy/SciPy function to find the interquartile range? I can do it pretty easily myself, but then mean() exists too, and that is basically sum/len.
...
    def IQR(dist):
        return np.percentile(dist, 75) - np.percentile(dist, 25)
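Running np.percentile(samples, [25, 50, 75]) returns the actual values from the list:

    Out[1]: array([12., 14., 22.])

However, the quartiles are Q1=10.0, Median=14, Q3=24.5 (you can also check the quartiles and median with an online calculator). A gap like this usually comes down to the quartile convention: np.percentile interpolates linearly between sorted data points by default, while many textbooks and calculators take the median of each half of the data.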
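To illustrate how two quartile conventions can disagree, here is a minimal sketch. The helper quartiles_median_of_halves and the samples list are hypothetical (the original samples are not shown in the post), and the "median of each half" rule is only assumed to be what the online calculator uses.

```python
import numpy as np


def quartiles_median_of_halves(values):
    """Return (Q1, median, Q3) using the 'median of each half' convention.

    Hypothetical helper for illustration. With an odd number of points,
    the overall median is left out of both halves. Needs at least 2 points.
    """
    data = np.sort(np.asarray(values, dtype=float))
    n = data.size
    half = n // 2
    lower = data[:half]        # points below the median position
    upper = data[n - half:]    # points above the median position
    return float(np.median(lower)), float(np.median(data)), float(np.median(upper))


# Hypothetical data, not the samples from the post.
samples = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# NumPy's default: linear interpolation between sorted data points.
linear = np.percentile(samples, [25, 50, 75])

# Textbook-style quartiles: median of the lower and upper halves.
halves = quartiles_median_of_halves(samples)

print("np.percentile:    ", linear)
print("median of halves: ", halves)
```

For this data np.percentile lands on the values 3, 5 and 7, while the median-of-halves rule gives 2.5, 5 and 7.5. Neither is wrong; they are different definitions of a quartile, so the IQR depends on which one you pick.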
The interquartile range, often denoted “IQR”, is a way to measure the spread of the middle 50% of a dataset. It is calculated as the third quartile (the 75th percentile) minus the first quartile (the 25th percentile) of a dataset.
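On the original question of a built-in: SciPy provides scipy.stats.iqr, which computes this difference directly. The sketch below assumes SciPy is installed and uses hypothetical data; it compares the built-in against the manual calculation from the definition above.

```python
import numpy as np
from scipy import stats

# Hypothetical data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])

# Manual IQR from the definition: 75th percentile minus 25th percentile.
q75, q25 = np.percentile(x, [75, 25])
manual_iqr = q75 - q25

# Built-in: by default it uses the same 25th and 75th percentiles.
scipy_iqr = stats.iqr(x)

print(manual_iqr, scipy_iqr)
```

With default settings the two values should agree, since both interpolate linearly between sorted data points.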
np.percentile takes multiple percentile arguments, and you are slightly better off doing:

    q75, q25 = np.percentile(x, [75, 25])
    iqr = q75 - q25

or

    iqr = np.subtract(*np.percentile(x, [75, 25]))

than making two calls to percentile:
    In [8]: x = np.random.rand(int(1e6))  # size must be an integer

    In [9]: %timeit q75, q25 = np.percentile(x, [75, 25]); iqr = q75 - q25
    10 loops, best of 3: 24.2 ms per loop

    In [10]: %timeit iqr = np.subtract(*np.percentile(x, [75, 25]))
    10 loops, best of 3: 24.2 ms per loop

    In [11]: %timeit iqr = np.percentile(x, 75) - np.percentile(x, 25)
    10 loops, best of 3: 33.7 ms per loop
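%timeit only works inside IPython. As a rough way to repeat the comparison from a plain script, here is a sketch using the standard-library timeit module. The function names iqr_one_call and iqr_two_calls are made up for this example, and the timings will differ by machine and NumPy version.

```python
import timeit

import numpy as np

# Seeded generator so the data is reproducible; the size is an integer.
rng = np.random.default_rng(0)
x = rng.random(1_000_000)


def iqr_one_call(a):
    """IQR from a single np.percentile call with both percentiles."""
    q75, q25 = np.percentile(a, [75, 25])
    return q75 - q25


def iqr_two_calls(a):
    """IQR from two separate np.percentile calls."""
    return np.percentile(a, 75) - np.percentile(a, 25)


# Total time for 10 runs of each variant, in seconds.
t_one = timeit.timeit(lambda: iqr_one_call(x), number=10)
t_two = timeit.timeit(lambda: iqr_two_calls(x), number=10)

print(f"one call:  {t_one / 10 * 1e3:.1f} ms per loop")
print(f"two calls: {t_two / 10 * 1e3:.1f} ms per loop")
```

Both variants return the same IQR; only the amount of work differs, since the single call processes the array once for both percentiles.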