Grouping by with Where conditions in Pandas

Question

I have a dataframe that includes the columns 'subscription_id', 'pause_start' and 'pause_end'.

I created the column 'dif_pause' by subtracting 'pause_start' from 'pause_end', and then computed its mean per subscription with groupby(), like this:

pauses['dif_pause'] = pauses['pause_end'] - pauses['pause_start']
# Convert the timedelta to whole days; missing values (NaT) become NaN
pauses['dif_pause'] = pauses['dif_pause'].dt.days

pauses_df=pauses.groupby(["subscription_id"])["dif_pause"].mean().reset_index(name="avg_pause")

I'd like the groupby step to include a check that pause_end > pause_start (some equivalent of the WHERE clause in SQL). How can one do that?

Thanks.

jezrael · Accepted Answer

It seems you need to filter first, using query or boolean indexing:

# Wrap the chain in parentheses so it can span several lines
(pauses.query("pause_end > pause_start")
       .groupby(["subscription_id"])["dif_pause"].mean().reset_index(name="avg_pause"))

# Same filter with boolean indexing
(pauses[pauses["pause_end"] > pauses["pause_start"]]
       .groupby(["subscription_id"])["dif_pause"].mean().reset_index(name="avg_pause"))
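To make the filter-then-group idea concrete, here is a minimal sketch on made-up data. The sample values are hypothetical and only the column names come from the question. It assumes 'pause_start' and 'pause_end' are datetime columns, and it uses `.dt.days` to get whole days.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
pauses = pd.DataFrame({
    "subscription_id": [1, 1, 2, 2],
    "pause_start": pd.to_datetime(
        ["2017-01-01", "2017-02-10", "2017-03-01", "2017-04-05"]),
    "pause_end": pd.to_datetime(
        ["2017-01-05", "2017-02-01", "2017-03-03", "2017-04-09"]),
})

# Pause length in whole days (negative when pause_end is before pause_start).
pauses["dif_pause"] = (pauses["pause_end"] - pauses["pause_start"]).dt.days

# Equivalent of SQL's WHERE: keep only rows where the pause ends after it starts.
valid = pauses[pauses["pause_end"] > pauses["pause_start"]]

# Average pause per subscription, computed on the filtered rows only.
avg_by_mask = (valid.groupby("subscription_id")["dif_pause"]
                    .mean()
                    .reset_index(name="avg_pause"))

# The same result using query() for the filter.
avg_by_query = (pauses.query("pause_end > pause_start")
                      .groupby("subscription_id")["dif_pause"]
                      .mean()
                      .reset_index(name="avg_pause"))

print(avg_by_mask)
```

In this sample, the second row of subscription 1 has pause_end before pause_start, so it is dropped before the mean is taken. The filter has to come before groupby because the mean is computed over whatever rows reach it.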

Tags:

python

pandas

where-clause

pandas-groupby

Keithx

1 Answers

jezrael
