
pandas: iteratively filtering a DataFrame's rows

Suppose I have a DataFrame like so,

import pandas as pd
df = pd.DataFrame([['x', 1, 2], ['x', 1, 3], ['y', 2, 2]],
                  columns=['a', 'b', 'c'])

To select all rows where c == 2 and a == 'x', I could do something like,

df[(df['a'] == 'x') & (df['c'] == 2)]

Or I could refine iteratively by making temporary variables,

df1 = df[df['a'] == 'x']
df2 = df1[df1['c'] == 2]

Is there a way to refine the rows iteratively in a single chained expression, without temporary variables? Something like,

(
  df
  .refine(lambda row: row['a'] == 'x')     # this method doesn't exist
  .refine(lambda row: row['c'] == 2)
)
duckworthd asked May 13 '26 10:05



1 Answer

While this isn't a solution for now, in pandas version 0.13 you'll be able to do

df.query('a == "x"').query('c == 2')

to achieve what you want.
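As a minimal sketch of the chained form, using the DataFrame from the question: each query() call returns a new DataFrame, so the second call filters the result of the first. The variable names chained and masked are only illustrative, and the sketch assumes a pandas version that provides DataFrame.query (0.13 or later).

```python
import pandas as pd

df = pd.DataFrame([['x', 1, 2], ['x', 1, 3], ['y', 2, 2]],
                  columns=['a', 'b', 'c'])

# Each query() returns a new DataFrame, so the calls can be chained.
chained = df.query('a == "x"').query('c == 2')

# The boolean-mask form from the question, for comparison.
masked = df[(df['a'] == 'x') & (df['c'] == 2)]

print(chained)
print(chained.equals(masked))
```

Both forms select the same single row here; the chained form just avoids naming the intermediate result.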

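The two conditions can also be combined into a single query string:

df.query('a == "x" and c == 2')

Note that plain indexing with an expression string, such as df['a == "x"'], does not work in released versions of pandas: a string inside df[...] is looked up as a column label.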

What's wrong with

df[(df.a == 'x') & (df.c == 2)]

until 0.13?
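For a chained form that stays close to the lambda style in the question, one option that is not part of the original answer is to pass a callable to .loc. This is a sketch that assumes a pandas version in which .loc accepts callables (0.18.1 or later). Note that the callable receives the whole intermediate DataFrame, not a single row, and must return a boolean mask; the name refined is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([['x', 1, 2], ['x', 1, 3], ['y', 2, 2]],
                  columns=['a', 'b', 'c'])

# Each lambda is called with the DataFrame produced by the previous step
# and returns a boolean Series used to select rows.
refined = (
    df
    .loc[lambda d: d['a'] == 'x']
    .loc[lambda d: d['c'] == 2]
)

print(refined)
```

Because each step sees only the already-filtered frame, later conditions can depend on earlier ones without any temporary variables.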

Phillip Cloud answered May 14 '26 22:05



