
pandas: filter rows of DataFrame with operator chaining

Most operations in pandas can be accomplished with operator chaining (groupby, aggregate, apply, etc.), but the only way I've found to filter rows is via normal bracket indexing:

df_filtered = df[df['column'] == value] 

This is unappealing because it requires that I assign df to a variable before I can filter on its values. Is there something more like the following?

df_filtered = df.mask(lambda x: x['column'] == value) 
duckworthd asked Aug 08 '12 17:08





1 Answer

I'm not entirely sure what you want, and your last line of code does not help either, but anyway:

"Chained" filtering is done by "chaining" the criteria in the boolean index.

In [96]: df
Out[96]:
   A  B  C  D
a  1  4  9  1
b  4  5  0  2
c  5  5  1  0
d  1  3  9  6

In [99]: df[(df.A == 1) & (df.D == 6)]
Out[99]:
   A  B  C  D
d  1  3  9  6
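If the goal is to apply the same combined criteria in the middle of a method chain, without first binding the frame to a name, one option in more recent pandas versions is to pass a callable to .loc, or to use DataFrame.query. The following is only a sketch under that assumption (neither was part of the original answer); the frame is rebuilt by hand to match the session above.

```python
import pandas as pd

# Rebuild the sample frame shown in the session above.
df = pd.DataFrame(
    {"A": [1, 4, 5, 1], "B": [4, 5, 5, 3], "C": [9, 0, 1, 9], "D": [1, 2, 0, 6]},
    index=list("abcd"),
)

# .loc accepts a callable; it receives the frame as it exists at that
# point in the chain, so no intermediate variable is needed.
result = df.loc[lambda d: (d["A"] == 1) & (d["D"] == 6)]

# DataFrame.query expresses the same combined criteria as a string.
same = df.query("A == 1 and D == 6")

print(result)
print(same)
```

Both forms can follow another chained call (for example a groupby/aggregate step), because the condition is evaluated against whatever frame reaches that step.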

If you want to chain methods, you can add your own mask method and use that one.

In [89]: import numpy as np, pandas

In [90]: def mask(df, key, value):
   ....:     return df[df[key] == value]
   ....:

In [92]: pandas.DataFrame.mask = mask

In [93]: df = pandas.DataFrame(np.random.randint(0, 10, (4,4)), index=list('abcd'), columns=list('ABCD'))

# .ix was removed in pandas 1.0; .loc is the label-based equivalent
In [95]: df.loc['d','A'] = df.loc['a', 'A']

In [96]: df
Out[96]:
   A  B  C  D
a  1  4  9  1
b  4  5  0  2
c  5  5  1  0
d  1  3  9  6

In [97]: df.mask('A', 1)
Out[97]:
   A  B  C  D
a  1  4  9  1
d  1  3  9  6

In [98]: df.mask('A', 1).mask('D', 6)
Out[98]:
   A  B  C  D
d  1  3  9  6
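One caveat: current pandas ships its own DataFrame.mask, which replaces values where a condition is true, so assigning a custom function to that name shadows the built-in. A possible alternative that keeps the chaining style without patching the class is DataFrame.pipe. This is a minimal sketch, not the answer's original method; filter_eq is a hypothetical helper name chosen for illustration.

```python
import pandas as pd


def filter_eq(frame, key, value):
    """Hypothetical helper: keep the rows where frame[key] equals value."""
    return frame[frame[key] == value]


# Rebuild the sample frame shown in the session above.
df = pd.DataFrame(
    {"A": [1, 4, 5, 1], "B": [4, 5, 5, 3], "C": [9, 0, 1, 9], "D": [1, 2, 0, 6]},
    index=list("abcd"),
)

# pipe passes the frame as the first argument, so calls chain like methods.
step_one = df.pipe(filter_eq, "A", 1)
result = df.pipe(filter_eq, "A", 1).pipe(filter_eq, "D", 6)

print(step_one)
print(result)
```

Because the helper is an ordinary function, it leaves pandas.DataFrame untouched and can be reused on any frame.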
Wouter Overmeire answered Sep 17 '22 08:09
