I have a pandas DataFrame called 'df' as follows:
                     value
date_time_index
2015-10-28 01:54:00    1.0
2015-10-28 01:55:00    1.0
2015-10-28 01:56:00    1.0
2015-10-28 01:57:00    1.0
2015-10-28 01:58:00    1.0
and I just want a new column with the difference in seconds between consecutive rows, how can I do this?
Note: The type is
type(df.index[1])
given as
pandas.tslib.Timestamp
One option is the diff() method, which calculates the difference between consecutive elements of a Series or DataFrame.
The same result can be obtained with the shift method by subtracting the shifted rows from the original. shift is a useful pre-step because it lets you see the previous row's data directly, while diff simply performs the subtraction and hides that intermediate step.
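To make the relationship between shift and diff concrete, here is a minimal sketch. The sample index is an assumption that mirrors the one-minute spacing in the question, and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's one-minute index.
idx = pd.date_range("2015-10-28 01:54:00", periods=5, freq="min")
stamps = pd.Series(idx, index=idx)

# Manual approach: subtract the previous row, which shift(1) exposes.
via_shift = stamps - stamps.shift(1)

# diff() performs the same subtraction in a single call.
via_diff = stamps.diff()

# Both hold timedeltas, with a missing value in the first row.
print(via_shift.equals(via_diff))
```

Either form leaves the first row empty, because there is no earlier row to subtract from.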
Pandas Timestamp objects can be compared with the ordinary comparison operators: >, <, ==, <=, >=. The difference between two timestamps can be calculated with the '-' operator. A given time can be converted to a pandas timestamp with pandas.Timestamp().
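A short sketch of comparing and subtracting two timestamps follows. The two values are taken from the question's index, and the variable names are only illustrative.

```python
import pandas as pd

# Two timestamps taken from the question's index.
earlier = pd.Timestamp("2015-10-28 01:54:00")
later = pd.Timestamp("2015-10-28 01:55:00")

# Comparison operators work directly on Timestamp objects.
is_later = later > earlier

# Subtraction yields a Timedelta, which can report its length in seconds.
gap = later - earlier
gap_seconds = gap.total_seconds()

print(is_later, gap, gap_seconds)
```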
On performance: apply generally outperforms iterrows by a wide margin, largely because iterrows builds a Series object for every row. itertuples, while slower than apply, is quicker than iterrows, so if looping is required, try itertuples instead.
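If an explicit loop really is needed, a rough sketch with itertuples might look like the following. The frame is rebuilt from the question's sample (an assumption), and the loop is compared with a vectorized diff only to show that both give the same numbers. No timing claim is made here.

```python
import pandas as pd

# Rebuild the question's frame (values assumed from the sample shown).
idx = pd.date_range("2015-10-28 01:54:00", periods=5, freq="min",
                    name="date_time_index")
df = pd.DataFrame({"value": [1.0] * 5}, index=idx)

# Explicit loop: itertuples exposes the index as the field 'Index'.
looped = []
previous = None
for row in df.itertuples():
    if previous is None:
        looped.append(0.0)  # no earlier row to compare against
    else:
        looped.append((row.Index - previous).total_seconds())
    previous = row.Index

# Vectorized equivalent, with no Python-level loop.
vectorized = df.index.to_series().diff().dt.total_seconds().fillna(0).tolist()

print(looped == vectorized)
```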
I'd do it like this:
df.index.to_series().diff().dt.total_seconds().fillna(0)
date_time_index
2015-10-28 01:54:00     0.0
2015-10-28 01:55:00    60.0
2015-10-28 01:56:00    60.0
2015-10-28 01:57:00    60.0
2015-10-28 01:58:00    60.0
Name: date_time_index, dtype: float64
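Since the question asks for a new column, here is a small self-contained sketch that attaches that result to the frame. The frame is rebuilt from the sample in the question, and 'delta_seconds' is a hypothetical column name chosen for illustration.

```python
import pandas as pd

# Rebuild the question's frame (values assumed from the sample shown).
idx = pd.date_range("2015-10-28 01:54:00", periods=5, freq="min",
                    name="date_time_index")
df = pd.DataFrame({"value": [1.0] * 5}, index=idx)

# 'delta_seconds' is a hypothetical column name.
# The Series returned by to_series() shares df's index, so it aligns on assignment.
df["delta_seconds"] = (
    df.index.to_series().diff().dt.total_seconds().fillna(0)
)

print(df)
```

Dropping the fillna(0) call would leave NaN in the first row instead, which may be preferable if a zero could be mistaken for a real measurement.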
I think I've worked it out using:
df['temp_index'] = df.index
df['Delta'] = df['temp_index'].diff().astype('timedelta64[m]')
This gives the difference in minutes rather than seconds (change m to s for seconds).
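One caveat: as far as I know, recent pandas versions only support second and finer resolutions in astype for timedeltas, so 'timedelta64[m]' may raise an error there. A sketch that should avoid the cast entirely is shown below. The frame is rebuilt from the question's sample, and 'Delta_s' and 'Delta_min' are hypothetical column names.

```python
import pandas as pd

# Rebuild the question's frame (values assumed from the sample shown).
idx = pd.date_range("2015-10-28 01:54:00", periods=5, freq="min",
                    name="date_time_index")
df = pd.DataFrame({"value": [1.0] * 5}, index=idx)

df["temp_index"] = df.index
deltas = df["temp_index"].diff()

# Seconds as floats; the first row has no predecessor, so it is NaN.
df["Delta_s"] = deltas.dt.total_seconds()

# Minutes as floats, obtained by dividing by a one-minute Timedelta.
df["Delta_min"] = deltas / pd.Timedelta(minutes=1)

print(df[["Delta_s", "Delta_min"]])
```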