Suppose that df is a pandas DataFrame. I want to split it into two DataFrames according to some criterion. The best way I've found for doing this is something like
df0, df1 = [v for _, v in df.groupby(df['class'] != 'special')]
In the above example, the criterion is the argument to the groupby method. The resulting df0 consists of the sub-dataframe where the class field has value 'special', and df1 is the complement of df0. (Unfortunately, with this construct, the sub-dataframe consisting of the items that fail the criterion is returned first, which is not intuitive.)
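To make the ordering concrete, here is a minimal sketch with a made-up four-row frame (the column values are assumptions for illustration only). By default groupby sorts its group keys, so the False group (rows that fail the criterion) is yielded before the True group:

```python
import pandas as pd

# Made-up sample data, purely for illustration.
df = pd.DataFrame({
    "class": ["special", "plain", "special", "other"],
    "value": [1, 2, 3, 4],
})

# The grouping key is a boolean Series; groupby sorts keys, so False comes first.
df0, df1 = [v for _, v in df.groupby(df["class"] != "special")]

print(df0["class"].tolist())  # ['special', 'special']  (criterion is False)
print(df1["class"].tolist())  # ['plain', 'other']      (criterion is True)
```

Note that if every row falls on the same side of the criterion, groupby yields only one group and the two-name unpacking raises a ValueError.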
The above construct has the drawback that it is not particularly readable, certainly not as readable as, for instance, some hypothetical splitby method like
df0, df1 = df.splitby(df['class'] == 'special')
Since splitting a dataframe like this is something I often need to do, I figure that there may be a built-in function, or maybe an established idiom, for doing this. If so, please let me know.
I think the most readable way to do this is:
m = df['class'] != 'special'
a, b = df[m], df[~m]
I haven't come across a special method for this...
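If you need this often, the mask idiom can be wrapped in a small helper. The sketch below is only one possible way to do it: split_by is a hypothetical name of my own, not a pandas method, and the sample frame is made up.

```python
import pandas as pd

def split_by(df, mask):
    """Hypothetical helper: return (rows where mask is True, rows where it is False)."""
    return df[mask], df[~mask]

# Made-up sample data, purely for illustration.
df = pd.DataFrame({
    "class": ["special", "plain", "special", "other"],
    "value": [1, 2, 3, 4],
})

# Matching rows come first, which reads more naturally than the groupby order.
special, rest = split_by(df, df["class"] == "special")

# Unlike the groupby unpacking, this still returns two frames when nothing matches.
none_found, everything = split_by(df, df["class"] == "missing")
```

Here the rows that satisfy the criterion are returned first, and an empty DataFrame comes back for a side with no rows, so the two-name unpacking never fails.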