
Pandas how to get rows with consecutive dates and sales more than 1000?

I have a data frame called df:

Date        Sales
01/01/2020    812
02/01/2020    981
03/01/2020    923
04/01/2020   1033
05/01/2020    988
...           ...

How can I get the first occurrence of 7 consecutive days with sales above 1000?

This is what I am doing to find the rows where sales is above 1000:

In  [221]:  df.loc[df["Sales"] > 1000]
Out [221]: 
Date        Sales
04/01/2020   1033
08/01/2020   1008
09/01/2020   1091
17/01/2020   1080
18/01/2020   1121
19/01/2020   1098
...           ...
anInputName asked Dec 04 '20 15:12




1 Answer

You can assign an identifier to each run of consecutive days, group by it, and return the first rows of each group (after first filtering to values > 1000):

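# Keep only days with sales above 1000 (Date must already be datetime and sorted)
df = df.query('Sales > 1000').copy()
# Start a new group id whenever the gap to the previous kept day is not exactly 1 day
df['grp_date'] = df.Date.diff().dt.days.fillna(1).ne(1).cumsum()
# First 7 rows of each run of consecutive days
df.groupby('grp_date').head(7).reset_index(drop=True)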

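Change the argument of head to n to get the first n rows of each run of consecutive days. Note that head(7) returns up to 7 rows per group, so runs shorter than 7 days also appear in the result.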
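To see how the diff / ne / cumsum chain labels runs, here is a small sketch using the six dates shown in the question's filtered output. The variable names (s, gap_days, starts_new_run, grp) are only illustrative.

```python
import pandas as pd

# The six dates from the question's filtered output (days with high sales)
s = pd.Series(pd.to_datetime(
    ["04/01/2020", "08/01/2020", "09/01/2020",
     "17/01/2020", "18/01/2020", "19/01/2020"],
    format="%d/%m/%Y",
))

# Gap in days to the previous kept row; the first row has no predecessor (NaN)
gap_days = s.diff().dt.days

# Treat the first row as a continuation (fillna(1)); any gap other than 1 starts a new run
starts_new_run = gap_days.fillna(1).ne(1)

# The running count of "new run" flags gives one id per run of consecutive days
grp = starts_new_run.cumsum()

print(grp.tolist())
```

Here 04/01 is alone in group 0, 08/01 and 09/01 share group 1, and 17/01 to 19/01 share group 2.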

Note: if Date holds strings, convert it first with df['Date'] = pd.to_datetime(df.Date, format='%d/%m/%Y') and sort by Date, since diff() needs datetime values in chronological order.
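If you only want the first run that actually lasts 7 days, one possible extension is to drop the shorter runs before taking the head. This is a sketch under assumptions, not part of the snippet above: the function name first_run_above, its parameters, and the sample data are all made up for illustration.

```python
import pandas as pd

def first_run_above(df, threshold=1000, n_days=7):
    """First n_days rows of the earliest run of >= n_days consecutive days with Sales > threshold."""
    out = df.copy()
    # Convert day/month/year strings to datetime and put rows in chronological order
    out["Date"] = pd.to_datetime(out["Date"], format="%d/%m/%Y")
    out = out.sort_values("Date")
    # Keep only the days above the threshold
    out = out[out["Sales"] > threshold].copy()
    # Same grouping idea as above: a new id whenever the gap is not exactly 1 day
    out["grp_date"] = out["Date"].diff().dt.days.fillna(1).ne(1).cumsum()
    # Length of the run each row belongs to
    run_len = out.groupby("grp_date")["Date"].transform("size")
    long_runs = out[run_len >= n_days]
    if long_runs.empty:
        return long_runs.drop(columns="grp_date")
    # Rows are sorted by date, so the first remaining row belongs to the earliest long run
    first_grp = long_runs["grp_date"].iloc[0]
    return (long_runs[long_runs["grp_date"] == first_grp]
            .head(n_days)
            .drop(columns="grp_date")
            .reset_index(drop=True))

# Hypothetical sample: 15 daily rows starting 01/01/2020
sample = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=15, freq="D").strftime("%d/%m/%Y"),
    "Sales": [812, 1033, 1008, 988, 1091, 1080, 1121, 1098,
              1005, 1010, 1200, 1150, 990, 1001, 1002],
})
result = first_run_above(sample)
print(result)
```

In this sample the two-day runs are discarded, and the result is the first 7 days of the 8-day run that starts on 05/01/2020. If no run is long enough, the function returns an empty frame.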

Cainã Max Couto-Silva answered Nov 07 '22 07:11