Here is a time series like this; call it df:
        No        Date  Value
0   600000  1999-11-10      1
1   600000  1999-11-11      1
2   600000  1999-11-12      1
3   600000  1999-11-15      1
4   600000  1999-11-16      1
5   600000  1999-11-17      1
6   600000  1999-11-18      0
7   600000  1999-11-19      1
8   600000  1999-11-22      1
9   600000  1999-11-23      1
10  600000  1999-11-24      1
11  600000  1999-11-25      0
12  600001  1999-11-26      1
13  600001  1999-11-29      1
14  600001  1999-11-30      0
I want to get the date range of each run of consecutive rows where 'Value' is 1. How can I get the following result?
       No   BeginDate     EndDate  Consecutive
0  600000  1999-11-10  1999-11-17            6
1  600000  1999-11-19  1999-11-24            4
2  600001  1999-11-26  1999-11-29            2
Pandas Series.repeat(): repeats each element of a Series consecutively a given number of times and returns a new Series. The repeats argument gives the number of repetitions for each element and should be a non-negative integer.
Calling size() or count() on the result of pandas.DataFrame.groupby() gives the number of occurrences of each group key found in a particular column of the DataFrame.
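The two methods are not interchangeable when the data contains missing values: size() counts every row in a group, while count() counts only the non-null entries. A minimal sketch on a made-up frame (the names df_demo, key and val are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: group "a" has one NaN, group "b" has one NaN.
df_demo = pd.DataFrame({
    "key": ["a", "a", "b", "b", "b"],
    "val": [1.0, np.nan, 2.0, 3.0, np.nan],
})

# size(): number of rows per group, NaN rows included.
sizes = df_demo.groupby("key").size()

# count(): number of non-null values of "val" per group.
counts = df_demo.groupby("key")["val"].count()

print(sizes)
print(counts)
```

For counting the length of a run, as in the question, size() is the safer choice because it does not depend on which column happens to contain nulls.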
diff() function: calculates the difference between each element and the element a given number of rows before it. Parameters: periods: integer, the number of rows to shift when computing the difference (default 1).
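To make the behaviour of periods concrete, here is a small sketch on a made-up Series (the name s and its values are illustrative only). Note that the first periods entries have nothing to subtract from and come back as NaN:

```python
import pandas as pd

# Hypothetical 0/1 series, similar in shape to the 'Value' column.
s = pd.Series([1, 1, 0, 1, 1, 1])

# Difference with the previous row (periods defaults to 1).
d1 = s.diff()

# Difference with the row two positions earlier.
d2 = s.diff(periods=2)

print(pd.DataFrame({"s": s, "diff_1": d1, "diff_2": d2}))
```

A nonzero entry in d1 marks a row where the value differs from the row before it, which is what makes diff() useful for detecting the start of a new run.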
This should do it:
df['value_grp'] = (df.Value.diff(1) != 0).astype('int').cumsum()
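value_grp increments by one whenever Value changes. The first row's diff is NaN, which also compares as not equal to 0, so the numbering starts at 1. Runs of 0 receive a label too; filter them out (for example with df[df.Value == 1]) before grouping if you only want the runs of 1. You can then extract the results per group: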
g = df.groupby('value_grp')
pd.DataFrame({
    'BeginDate': g.Date.first(),
    'EndDate': g.Date.last(),
    'Consecutive': g.size(),
    'No': g.No.first(),
}).reset_index(drop=True)
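For reference, here is a self-contained sketch that rebuilds the sample data from the question and produces the requested table. It is a variation on the approach above, not the only way to do it, and it makes two assumptions beyond the original answer: a run is also broken whenever No changes (so a run cannot span two different 'No' values), and only rows with Value == 1 are kept. The names new_run, ones and result are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "No": [600000] * 12 + [600001] * 3,
    "Date": pd.to_datetime([
        "1999-11-10", "1999-11-11", "1999-11-12", "1999-11-15", "1999-11-16",
        "1999-11-17", "1999-11-18", "1999-11-19", "1999-11-22", "1999-11-23",
        "1999-11-24", "1999-11-25", "1999-11-26", "1999-11-29", "1999-11-30",
    ]),
    "Value": [1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0],
})

# Assumption: a new run starts when Value changes OR when No changes.
# The first row compares against NaN, so it always starts a run.
new_run = df["Value"].ne(df["Value"].shift()) | df["No"].ne(df["No"].shift())
df["value_grp"] = new_run.cumsum()

# Keep only the runs of 1, then summarise each run.
ones = df[df["Value"] == 1]
result = (
    ones.groupby("value_grp")
    .agg(
        No=("No", "first"),
        BeginDate=("Date", "first"),
        EndDate=("Date", "last"),
        Consecutive=("Date", "size"),
    )
    .reset_index(drop=True)
)

print(result)
```

The rows are assumed to be sorted by No and Date already, as they are in the question; if they are not, sort with df.sort_values(["No", "Date"]) before labelling the runs.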