
How to select rows in a pandas DataFrame where another column falls between two values

I have a dataframe organised like so:

    x   y   e
A 0 0.0 1.0 0.01
  1 0.1 0.9 0.03
  2 0.2 1.3 0.02
...
B 0 0.0 0.5 0.02
  1 0.1 0.6 0.02
  2 0.2 0.9 0.04
...

etc.

I would like to select the rows of A, B, etc. whose x values fall between certain bounds.

This, for example, works:

p,q=0,1
indices=df.loc[("A"),"x"].between(p,q)
df.loc[("A"),"y"][indices]

Out:
[1.0,0.9]

However, this takes two lines of code and uses chained indexing. What seems to me the obvious way of one-lining it doesn't work:

p,q=0,1
df.loc[("A",df[("A"),"x"].between(p,q)),"y"]

Desired output:
[1.0,0.9]

How can I avoid chained indexing here?

(Also, if anyone wants to explain how to turn the "x" column into the index and thereby avoid the 0, 1, 2 index level, feel free!)

[Edited to clarify desired output]

ARGratrex asked Sep 10 '25 12:09


2 Answers

You can merge your 2 lines of code by using a lambda function.

>>> df.loc['A'].loc[lambda A: A['x'].between(p, q), 'y']

1    0.9
2    1.3
Name: y, dtype: float64

The output of your code:

indices=df.loc[("A"),"x"].between(p,q)
output=df.loc[("A"),"y"][indices]
print(output)

# Output
1    0.9
2    1.3
Name: y, dtype: float64
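
The side question about using x as an index is not covered above. As a rough sketch, assuming the frame from the question (the level names "group" and "n" and the bounds 0.1 and 0.2 are made up for illustration), one option is to swap the integer level for x and slice by label with pd.IndexSlice:

```python
import pandas as pd

# Sample frame rebuilt from the question; the level names "group" and "n"
# are assumptions, the question does not show any.
index = pd.MultiIndex.from_product([["A", "B"], [0, 1, 2]], names=["group", "n"])
df = pd.DataFrame(
    {
        "x": [0.0, 0.1, 0.2, 0.0, 0.1, 0.2],
        "y": [1.0, 0.9, 1.3, 0.5, 0.6, 0.9],
        "e": [0.01, 0.03, 0.02, 0.02, 0.02, 0.04],
    },
    index=index,
)

# Drop the 0, 1, 2 level and append x as the second index level instead.
by_x = df.droplevel("n").set_index("x", append=True).sort_index()

p, q = 0.1, 0.2  # illustrative bounds; label slices include both ends

# One .loc call: group "A", x from p to q, column "y".
result = by_x.loc[pd.IndexSlice["A", p:q], "y"]
print(result)
```

Label slicing on a MultiIndex expects a sorted index, hence the sort_index() call. Whether float values make a good index is a separate judgment call, since exact float labels can be awkward to look up.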
Corralien answered Sep 13 '25 03:09


You can build a single boolean mask that combines both conditions:

cond = df['x'].between(0.05,0.15) & (df.index.get_level_values(level=0)=='A')
df[cond]
Out[284]: 
       x    y     e
A B                
A 1  0.1  0.9  0.03
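
To pull out only the y column without chained indexing, the same kind of mask can be passed to a single .loc call together with the column label. A minimal sketch, assuming the sample data from the question (the level names "group" and "n" and the bounds are placeholders, not taken from the post):

```python
import pandas as pd

# Sample frame rebuilt from the question; level names are assumptions.
index = pd.MultiIndex.from_product([["A", "B"], [0, 1, 2]], names=["group", "n"])
df = pd.DataFrame(
    {
        "x": [0.0, 0.1, 0.2, 0.0, 0.1, 0.2],
        "y": [1.0, 0.9, 1.3, 0.5, 0.6, 0.9],
        "e": [0.01, 0.03, 0.02, 0.02, 0.02, 0.04],
    },
    index=index,
)

p, q = 0.05, 0.25  # placeholder bounds; between() is inclusive by default

# Combine the group test and the range test into one row mask.
mask = (df.index.get_level_values("group") == "A") & df["x"].between(p, q)

# Rows and column are selected in one indexing step, so nothing is chained.
result = df.loc[mask, "y"]
print(result)
```

Because the mask is computed on the full frame, the result keeps both index levels, which may or may not be what is wanted downstream.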
BENY answered Sep 13 '25 03:09