How to split a dataframe according to a boolean criterion?

Question

Suppose that df is a pandas dataframe. I want to split it into two dataframes according to some criterion. The best way I've found for doing this is something like

df0, df1 = [v for _, v in df.groupby(df['class'] != 'special')]

In the above example, the criterion is the argument to the groupby method. The resulting df0 consists of the sub-dataframe where the class field has value 'special', and df1 is the complement of df0. (Unfortunately, with this construct, the sub-dataframe consisting of the items that fail the criterion is returned first, which is not intuitive.)
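To make the ordering concrete, here is a minimal sketch on a made-up frame (the sample rows and the value column are assumptions for illustration, not data from my actual problem). By default groupby sorts the group keys, so the False group comes out before the True group:

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame(
    {"class": ["special", "plain", "special", "other"], "value": [1, 2, 3, 4]}
)

criterion = df["class"] != "special"

# groupby sorts its keys by default, so the False group is yielded first.
keys = [key for key, _ in df.groupby(criterion)]
df0, df1 = [part for _, part in df.groupby(criterion)]

print(df0["class"].tolist())  # rows where the criterion is False
print(df1["class"].tolist())  # rows where the criterion is True
```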

The above construct has the drawback that it is not particularly readable, certainly not as readable as, for instance, some hypothetical splitby method like

df0, df1 = df.splitby(df['class'] == 'special')

Since splitting a dataframe like this is something I often need to do, I figure that there may be a built-in function, or maybe an established idiom, for doing this. If so, please let me know.

Andy Hayden · Accepted Answer

I think the most readable way to do this is:

m = df['class'] != 'special'
a, b = df[m], df[~m]

I haven't come across a special method for this...
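If you need this often, the two lines above can be wrapped in a small helper. The sketch below is only an illustration: split_by_mask is a hypothetical name, not a pandas function, and the sample frame is made up.

```python
import pandas as pd


def split_by_mask(frame, mask):
    """Hypothetical helper: return (rows where mask is True, rows where it is False)."""
    return frame[mask], frame[~mask]


# Made-up sample data, for illustration only.
df = pd.DataFrame(
    {"class": ["special", "plain", "special", "other"], "value": [1, 2, 3, 4]}
)

# Rows that match the criterion come first, the complement second.
special, rest = split_by_mask(df, df["class"] == "special")

# The helper still returns two frames when one side is empty.
everything, nothing = split_by_mask(df, df["value"] > 0)
```

One difference from the groupby construct: when every row passes (or every row fails) the criterion, groupby yields a single group, so unpacking into two names raises a ValueError. The mask version simply returns an empty dataframe for the missing side.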

Tags:

pandas

kjo