Drop pandas dataframe rows based on groupby() condition

Question

I have the following pandas dataframe as input:

store_id  item_id  items_sold        date
       1        1           0  2015-12-28
       1        2           1  2015-12-28
       1        1           0  2015-12-28
       2        2           0  2015-12-28
       2        1           1  2015-12-29
       2        2           1  2015-12-29
       2        1           0  2015-12-29
       3        1           0  2015-12-30
       3        1           0  2015-12-30

I need to drop all rows for items that have never been sold in a particular store, i.e. the (store_id, item_id) pairs (1, 1) and (3, 1) in this dataframe.

The output I expect is the following:

store_id  item_id  items_sold        date
       1        2           1  2015-12-28
       2        2           0  2015-12-28
       2        1           1  2015-12-29
       2        2           1  2015-12-29
       2        1           0  2015-12-29

I've figured out how to find the required (store_id, item_id) pairs using df.groupby(...)[...].sum(), but I'm stuck on dropping them from the initial dataframe.

MaxU - stop WAR against UA · Accepted Answer

Is this what you want?

In [30]: df[df.groupby(['store_id', 'item_id'])['items_sold'].transform('sum') > 0]
Out[30]:
   store_id  item_id  items_sold        date
1         1        2           1  2015-12-28
3         2        2           0  2015-12-28
4         2        1           1  2015-12-29
5         2        2           1  2015-12-29
6         2        1           0  2015-12-29
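
To make the one-liner above easier to follow, here is a minimal self-contained sketch that rebuilds the question's dataframe by hand and splits the filter into steps. The variable names (group_total, kept, kept_via_filter) are made up for illustration. The key point is that transform('sum') returns a Series aligned with the original index, holding each row's group total, so it can be used directly as a boolean mask. The groupby().filter() variant at the end is an alternative not mentioned in the answer; it should select the same rows here, though it calls a Python function per group.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "store_id":   [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "item_id":    [1, 2, 1, 2, 1, 2, 1, 1, 1],
    "items_sold": [0, 1, 0, 0, 1, 1, 0, 0, 0],
    "date": ["2015-12-28"] * 4 + ["2015-12-29"] * 3 + ["2015-12-30"] * 2,
})

# For every row, the total items_sold of its (store_id, item_id) group.
# Unlike .sum(), .transform('sum') keeps the original row index.
group_total = df.groupby(["store_id", "item_id"])["items_sold"].transform("sum")

# Keep only rows whose group has at least one sale.
kept = df[group_total > 0]
print(kept)

# Alternative (an assumption, not part of the answer): keep whole groups
# for which the predicate is True.
kept_via_filter = df.groupby(["store_id", "item_id"]).filter(
    lambda g: g["items_sold"].sum() > 0
)
```

If the pairs themselves are needed (as in the question's groupby().sum() attempt), the same group_total Series can be inspected before masking.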

Tags: python, pandas

Sasha Korekov
