
Python: Pandas Series - Difference between consecutive datetime rows in seconds

I have a pandas DataFrame called 'df' with a datetime index, as follows:

                       value    
date_time_index         
2015-10-28 01:54:00     1.0 
2015-10-28 01:55:00     1.0 
2015-10-28 01:56:00     1.0 
2015-10-28 01:57:00     1.0 
2015-10-28 01:58:00     1.0 

I want a new column holding the difference in seconds between consecutive rows. How can I do this?

Note: the type of an index entry, checked with

 type(df.index[1])

is

 pandas.tslib.Timestamp
Runner Bean asked Aug 10 '16 02:08



2 Answers

I'd do it like this:

df.index.to_series().diff().dt.total_seconds().fillna(0)

date_time_index
2015-10-28 01:54:00     0.0
2015-10-28 01:55:00    60.0
2015-10-28 01:56:00    60.0
2015-10-28 01:57:00    60.0
2015-10-28 01:58:00    60.0
Name: date_time_index, dtype: float64
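
To get this into the frame as a new column, the result can be assigned directly, since `to_series()` keeps the original index and the assignment aligns on it. A minimal sketch, assuming a frame rebuilt from the sample data in the question; the column name `delta_seconds` is my own choice, not something from the question:

```python
import pandas as pd

# Rebuild the sample data from the question (assumed: one row per minute).
idx = pd.date_range(
    "2015-10-28 01:54:00", periods=5, freq="min", name="date_time_index"
)
df = pd.DataFrame({"value": 1.0}, index=idx)

# Difference between consecutive index entries, in seconds.
# The first row has no predecessor, so its NaN is replaced with 0.
# "delta_seconds" is a hypothetical column name.
df["delta_seconds"] = (
    df.index.to_series().diff().dt.total_seconds().fillna(0)
)

print(df)
```

Drop the `.fillna(0)` if you would rather keep `NaN` for the first row, which has nothing to be compared against.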
piRSquared answered Oct 03 '22 07:10


I think I've worked it out using

df['temp_index'] = df.index
df['Delta'] = df['temp_index'].diff().astype('timedelta64[m]')

which gives the difference in minutes rather than seconds (change m to s for seconds).
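
One caveat: as far as I know, recent pandas releases (2.0 and later) only accept the resolutions 's', 'ms', 'us' and 'ns' in this kind of cast, so `astype('timedelta64[m]')` may raise there, and the 's' variant returns a timedelta column rather than plain numbers. A sketch that should not depend on the version, assuming the sample data from the question; the column names `delta_seconds` and `delta_minutes` are hypothetical:

```python
import pandas as pd

# Rebuild the sample data from the question (assumed: one row per minute).
idx = pd.date_range(
    "2015-10-28 01:54:00", periods=5, freq="min", name="date_time_index"
)
df = pd.DataFrame({"value": 1.0}, index=idx)

# Copy the index into a column, then diff it to get Timedelta values.
df["temp_index"] = df.index
delta = df["temp_index"].diff()

# Seconds as floats; the first row stays NaN because it has no predecessor.
df["delta_seconds"] = delta.dt.total_seconds()

# Minutes as floats, by dividing by a one-minute Timedelta.
df["delta_minutes"] = delta / pd.Timedelta(minutes=1)

print(df[["value", "delta_seconds", "delta_minutes"]])
```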

Runner Bean answered Oct 03 '22 07:10