
Pandas diff() functionality on two columns in a dataframe

I have a data frame in which column A is the start time of an activity and column B is the finish time of that activity, and each row represents an activity (rows are arranged chronologically). I want to compute the difference in time between the end of one activity and the start of the next activity, i.e. df[i+1][A] - df[i][B].

Is there a pandas function to do this? The only thing I can find is diff(), but that appears to work only on a single column.

asked Jun 15 '14 18:06 by derNincompoop


People also ask

How do you find the difference between two columns in pandas?

The difference between rows or columns of a pandas DataFrame is computed with the diff() method. The axis parameter determines whether the difference is taken between rows or between columns.

What does diff () do in pandas?

The diff() method returns a DataFrame with the difference between the values for each row and, by default, the previous row. Which row to compare with can be specified with the periods parameter.

What does diff () do in Python?

diff() computes the first discrete difference of an object along the given axis. The periods parameter sets how many positions to shift when forming the difference, and axis selects whether the difference is taken over rows (0) or columns (1).
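
To make the periods and axis parameters concrete, here is a small sketch on a made-up DataFrame (the column names x and y and their values are purely illustrative). Note that diff(axis=1) subtracts neighbouring columns within the same row, so on its own it does not give a difference that spans both two columns and two rows.

```python
import pandas as pd

# Hypothetical data, only for illustrating the parameters.
df = pd.DataFrame({"x": [1, 4, 9], "y": [2, 6, 12]})

# Default: each row minus the previous row, column by column.
row_diff = df.diff()

# periods=2: each row minus the row two positions above it.
row_diff2 = df.diff(periods=2)

# axis=1: each column minus the column to its left, within the same row.
col_diff = df.diff(axis=1)

print(row_diff)
print(row_diff2)
print(col_diff)
```

The first row (or first column, for axis=1) has nothing to subtract from, so it comes back as NaN.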


1 Answer

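You can shift column A up by one row first, so that each row holds the start time of the next activity, and then subtract column B: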

df['A'].shift(-1) - df['B']
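
As a minimal sketch of how this plays out, assuming A and B are datetime columns (the timestamps below are invented for illustration, and the gap column name is just a label chosen here):

```python
import pandas as pd

# Hypothetical activities: A = start time, B = finish time, in chronological order.
df = pd.DataFrame({
    "A": pd.to_datetime(["2014-06-15 09:00", "2014-06-15 10:30", "2014-06-15 13:00"]),
    "B": pd.to_datetime(["2014-06-15 10:00", "2014-06-15 12:00", "2014-06-15 14:00"]),
})

# shift(-1) moves every start time up one row, so row i holds the start
# of activity i+1; subtracting B then gives the idle time after activity i.
df["gap"] = df["A"].shift(-1) - df["B"]

print(df)
```

The last row has no following activity, so its gap is NaT. If that row is not wanted, something like df["gap"].dropna() should remove it.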

answered Oct 06 '22 23:10 by Happy001