I have a data frame called df
:
Date Sales
01/01/2020 812
02/01/2020 981
03/01/2020 923
04/01/2020 1033
05/01/2020 988
... ...
How can I get the first occurrence of 7 consecutive days with sales above 1000?
This is what I am doing to find the rows where sales is above 1000:
In [221]: df.loc[df["sales"] >= 1000]
Out [221]:
Date Sales
04/01/2020 1033
08/01/2020 1008
09/01/2020 1091
17/01/2020 1080
18/01/2020 1121
19/01/2020 1098
... ...
Find Maximum Element in Pandas DataFrame's Row Finding the max element of each DataFrame row relies on the max() method as well, but we set the axis argument to 1 . The default value for the axis argument is 0. If the axis equals to 0, the max() method will find the max element of each column.
You can assign a unique identifier per consecutive days, group by them, and return the first value per group (with a previous filter of values > 1000):
df = df.query('Sales > 1000').copy()
df['grp_date'] = df.Date.diff().dt.days.fillna(1).ne(1).cumsum()
df.groupby('grp_date').head(7).reset_index(drop=True)
where you can change the value of head
parameter to the first n
rows from consecutive days.
Note: you may need to use pd.to_datetime(df.Date, format='%d/%m/%Y')
to convert dates from strings to pandas datetime, and sort them.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With