
How to select rows in a DataFrame between two values, in Python Pandas?

Tags:

python

pandas

People also ask

How do you get between values in pandas?

Pandas Series.between(): this method returns a boolean Series equivalent to left <= series <= right, that is, True wherever the corresponding element lies between the boundary values left and right (both endpoints included by default). NA values are treated as False.
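A small sketch of that behaviour on a made-up Series (the values are illustrative and not taken from the question). The string form of the `inclusive` keyword is assumed to be available, which should hold for recent pandas releases (1.3 and later); older versions took a boolean instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, including one missing value
s = pd.Series([98, 99, 100, 101, np.nan])

# Default: both endpoints are included (left <= s <= right)
mask_both = s.between(99, 101)

# Exclude both endpoints (left < s < right); string form assumes pandas >= 1.3
mask_open = s.between(99, 101, inclusive="neither")

print(mask_both.tolist())
print(mask_open.tolist())
```

In both masks the missing value comes out as False, so it is dropped when the mask is used to filter.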


Consider also Series.between():

df = df[df['closing_price'].between(99, 101)]

You should use parentheses to group each boolean comparison, because & binds more tightly than >= and <=; without them the expression is ambiguous:

df = df[(df['closing_price'] >= 99) & (df['closing_price'] <= 101)]
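To illustrate why the grouping matters, here is a minimal sketch on hypothetical data. Without parentheses, Python applies & first and is then left with a chained comparison on Series, which pandas refuses to reduce to a single True/False.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"closing_price": [98, 99, 100, 101, 102]})

# Without parentheses this parses as
#   df[...] >= (99 & df[...]) <= 101
# which needs the truth value of a whole Series and raises ValueError.
try:
    df[df["closing_price"] >= 99 & df["closing_price"] <= 101]
    raised = False
except ValueError:
    raised = True

# With parentheses each comparison is evaluated first,
# then the two boolean Series are combined element-wise.
selected = df[(df["closing_price"] >= 99) & (df["closing_price"] <= 101)]

print(raised)
print(selected)
```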

There is also a nicer alternative, the query() method:

In [58]: df = pd.DataFrame({'closing_price': np.random.randint(95, 105, 10)})

In [59]: df
Out[59]:
   closing_price
0            104
1             99
2             98
3             95
4            103
5            101
6            101
7             99
8             95
9             96

In [60]: df.query('99 <= closing_price <= 101')
Out[60]:
   closing_price
1             99
5            101
6            101
7             99
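As a quick cross-check, the sketch below rebuilds the frame from the values printed above (fixed rather than random, so it is reproducible) and compares the boolean mask, between() and query() approaches. It is only an illustration that the three select the same rows here.

```python
import pandas as pd

# Same values as the Out[59] listing above, hard-coded for reproducibility
df = pd.DataFrame(
    {"closing_price": [104, 99, 98, 95, 103, 101, 101, 99, 95, 96]}
)

# 1. Explicit boolean mask with parentheses
by_mask = df[(df["closing_price"] >= 99) & (df["closing_price"] <= 101)]

# 2. Series.between(), inclusive on both ends by default
by_between = df[df["closing_price"].between(99, 101)]

# 3. query() with a chained comparison inside the expression string
by_query = df.query("99 <= closing_price <= 101")

print(by_mask.equals(by_between), by_mask.equals(by_query))
print(by_query.index.tolist())
```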

UPDATE: answering the comment:

I like the syntax here but fell down when trying to combine it with an expression: df.query('(mean + 2 *sd) <= closing_price <=(mean + 2 *sd)')

In [161]: qry = "(closing_price.mean() - 2*closing_price.std())" +\
     ...:       " <= closing_price <= " + \
     ...:       "(closing_price.mean() + 2*closing_price.std())"
     ...:

In [162]: df.query(qry)
Out[162]:
   closing_price
0             97
1            101
2             97
3             95
4            100
5             99
6            100
7            101
8             99
9             95
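An alternative sketch, for readers who prefer to keep the arithmetic outside the query string: compute the two bounds once in Python and pass them to between(). The names `lower` and `upper` are introduced here for illustration, and the data is copied from the output above. Note that std() in pandas uses the sample standard deviation (ddof=1) by default.

```python
import pandas as pd

# Same values as the Out[162] listing above
df = pd.DataFrame(
    {"closing_price": [97, 101, 97, 95, 100, 99, 100, 101, 99, 95]}
)

prices = df["closing_price"]

# Bounds at mean +/- 2 standard deviations (sample std, ddof=1)
lower = prices.mean() - 2 * prices.std()
upper = prices.mean() + 2 * prices.std()

# Keep only the rows whose price lies inside the band
within = df[prices.between(lower, upper)]

print(lower, upper)
print(len(within))
```

For this data every row falls inside the band, which matches the ten rows returned by the query above.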

You can also use the .between() method:

emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv")

emp[emp["Salary"].between(60000, 61000)]



newdf = df.query('closing_price.mean() <= closing_price <= closing_price.std()')

or

mean = df['closing_price'].mean()
std = df['closing_price'].std()

newdf = df.query('@mean <= closing_price <= @std')
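The @ prefix tells query() to look a name up in the surrounding Python scope rather than among the column names. A minimal sketch with hypothetical data and hypothetical bounds `low` and `high`:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"closing_price": [95, 98, 99, 100, 101, 104]})

# Local Python variables, referenced inside the query string via @
low, high = 99, 101

newdf = df.query("@low <= closing_price <= @high")

print(newdf["closing_price"].tolist())
```

Keeping the bounds in variables avoids rebuilding the query string whenever the range changes.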