how to find the index for a quantile

I'm using a pandas series and I want to find the index value that represents the quantile.

If I have:

import numpy as np
import pandas as pd
np.random.seed(8)
s = pd.Series(np.random.rand(6), ['a', 'b', 'c', 'd', 'e', 'f'])
s

a    0.873429
b    0.968541
c    0.869195
d    0.530856
e    0.232728
f    0.011399
dtype: float64

And do

s.quantile(.5)

I get

0.70002511588475946

What I want to know is the index value of s for the point just before that quantile value. In this case I know the index value should be d.

How is quantile calculated in pandas?

The pandas DataFrame.quantile() method calculates the quantile of the values along a given axis. With the default axis (axis=0, the index), it returns one quantile per column. By specifying the column axis (axis='columns'), it calculates the quantile across the columns and returns one quantile value for each row.
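As a small illustration of the two axis settings, here is a sketch on a made-up frame. The column names x and y and their values are assumptions chosen for easy checking by hand, not data from the question.

```python
import pandas as pd

# Hypothetical data, chosen so the medians are easy to verify by hand.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [10.0, 20.0, 30.0, 40.0]})

# Default axis=0: one quantile per column.
per_column = df.quantile(0.5)

# axis="columns": one quantile per row, taken across the columns.
per_row = df.quantile(0.5, axis="columns")

print(per_column)
print(per_row)
```

Here per_column is indexed by the column names, while per_row keeps the frame's row index.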

How does Python find Quantiles?

In Python, the numpy.quantile() function takes an array and a number q between 0 and 1 (inclusive). It returns the q-th quantile of the data, interpolating linearly between the two nearest values by default.
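A minimal sketch of numpy.quantile with its default linear interpolation follows. The array is made up for illustration.

```python
import numpy as np

# Hypothetical data: four evenly spaced values.
values = np.array([1.0, 2.0, 3.0, 4.0])

# A single q returns a scalar; the median falls halfway between 2.0 and 3.0.
median = np.quantile(values, 0.5)

# A sequence of q values returns an array of quantiles.
quartiles = np.quantile(values, [0.25, 0.75])

print(median, quartiles)
```

Because the result is interpolated, it need not be one of the values in the array, which is why looking up its index directly does not work in general.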

If you set the interpolation argument to 'lower', 'higher', or 'nearest', the quantile is always one of the actual values in s, so the problem can be solved a bit more simply:

s[s == s.quantile(.5, interpolation='lower')]

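I'd guess this method is a fair bit faster than piRSquared's solution as well. Note that it returns the matching row(s) of s as a Series rather than the bare label; take .index of the result to get the label itself.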
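To get the bare label out of that expression, one option is to take the first entry of the resulting index. This is a sketch, not part of the original answer: the Series values are typed in from the question's printed output (so they are rounded), and the variable names are made up.

```python
import pandas as pd

# Values copied from the question's printed output (rounded to 6 decimals).
s = pd.Series(
    [0.873429, 0.968541, 0.869195, 0.530856, 0.232728, 0.011399],
    index=["a", "b", "c", "d", "e", "f"],
)

# With interpolation="lower" the quantile is an actual element of s,
# so an exact equality comparison is safe here.
qv = s.quantile(0.5, interpolation="lower")

# Boolean indexing keeps every row equal to qv (more than one if there are ties).
matches = s[s == qv]

# Take the first matching label.
label = matches.index[0]
print(label)
```

If s contains duplicate values, matches can hold several rows, so it may be worth deciding explicitly which label to keep.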

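Use sort_values, reverse the order, flag every value that is less than or equal to the calculated quantile, then take idxmax. On a boolean Series, idxmax returns the label of the first True, which here is the largest value that does not exceed the quantile.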

(s.sort_values()[::-1] <= s.quantile(.5)).idxmax()

Or:

(s.sort_values(ascending=False) <= s.quantile(.5)).idxmax()

We can functionalize it:

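def idxquantile(s, q=0.5, *args, **kwargs):
    """Return the index label of the largest value in s that is <= the q-th quantile."""
    # Extra arguments (e.g. interpolation) are passed through to Series.quantile.
    qv = s.quantile(q, *args, **kwargs)
    return (s.sort_values()[::-1] <= qv).idxmax()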

idxquantile(s)
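To see why this returns d, the intermediate steps can be inspected one at a time. This is a sketch under the assumption that the Series holds the values printed in the question (typed in here, so rounded); the variable names desc, mask, and label are made up for the walkthrough.

```python
import pandas as pd

# Values copied from the question's printed output (rounded to 6 decimals).
s = pd.Series(
    [0.873429, 0.968541, 0.869195, 0.530856, 0.232728, 0.011399],
    index=["a", "b", "c", "d", "e", "f"],
)

# Step 1: the interpolated median lies between c (0.869195) and d (0.530856).
qv = s.quantile(0.5)

# Step 2: sort from largest to smallest; the label order becomes b, a, c, d, e, f.
desc = s.sort_values(ascending=False)

# Step 3: flag values at or below the quantile: False for b, a, c; True for d, e, f.
mask = desc <= qv

# Step 4: idxmax returns the label of the first True.
label = mask.idxmax()
print(label)
```

Sorting in descending order is what makes the first True the value just below the quantile; with an ascending sort, idxmax would instead return the label of the minimum.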

Tags:

python

pandas

Brian

2 Answers

JoeTheShmoe

piRSquared
