
Pandas: Subtracting two date columns and the result being an integer

I have two columns in a Pandas data frame that are dates.

I want to subtract one column from the other and get the difference in days as an integer.

A peek at the data:

df_test.head(10)
Out[20]:
   First_Date Second Date
0  2016-02-09  2015-11-19
1  2016-01-06  2015-11-30
2         NaT  2015-12-04
3  2016-01-06  2015-12-08
4         NaT  2015-12-09
5  2016-01-07  2015-12-11
6         NaT  2015-12-12
7         NaT  2015-12-14
8  2016-01-06  2015-12-14
9         NaT  2015-12-15

I have created a new column successfully with the difference:

df_test['Difference'] = df_test['First_Date'].sub(df_test['Second Date'], axis=0)
df_test.head()
Out[22]:
   First_Date Second Date  Difference
0  2016-02-09  2015-11-19     82 days
1  2016-01-06  2015-11-30     37 days
2         NaT  2015-12-04         NaT
3  2016-01-06  2015-12-08     29 days
4         NaT  2015-12-09         NaT

However, I am unable to get a numeric version of the result:

df_test['Difference'] = df_test[['Difference']].apply(pd.to_numeric)
df_test.head()
Out[25]:
   First_Date Second Date    Difference
0  2016-02-09  2015-11-19  7.084800e+15
1  2016-01-06  2015-11-30  3.196800e+15
2         NaT  2015-12-04           NaN
3  2016-01-06  2015-12-08  2.505600e+15
4         NaT  2015-12-09           NaN
Kevin asked Jun 15 '16 16:06



2 Answers

How about:

df_test['Difference'] = (df_test['First_Date'] - df_test['Second Date']).dt.days 

This returns the difference as int if there are no missing values (NaT), and as float if there are.

pandas has rich documentation on time series / date functionality and on time deltas.
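If an integer column is needed even though some rows are NaT, one option is the nullable integer dtype. The following is a minimal sketch, assuming pandas 0.24 or later (where the "Int64" dtype exists) and a small hypothetical frame that mirrors the first rows of the question's data:

```python
import pandas as pd

# Hypothetical sample mirroring the first three rows from the question.
df_test = pd.DataFrame({
    "First_Date": pd.to_datetime(["2016-02-09", "2016-01-06", None]),
    "Second Date": pd.to_datetime(["2015-11-19", "2015-11-30", "2015-12-04"]),
})

# Subtracting two datetime columns yields a timedelta column.
delta = df_test["First_Date"] - df_test["Second Date"]

# .dt.days extracts whole days; the NaT row becomes NaN, so the dtype is float.
days_float = delta.dt.days

# Assumption: pandas >= 0.24. The nullable "Int64" dtype (capital I) keeps
# integer values and stores the missing entry as <NA> instead of NaN.
days_int = days_float.astype("Int64")

df_test["Difference"] = days_int
print(df_test)
```

Note that "Int64" is a pandas extension dtype, not NumPy's int64, so code that expects plain NumPy integers may need the missing rows dropped or filled first.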

Prayson W. Daniel answered Sep 19 '22 02:09


You can divide a column of dtype timedelta by np.timedelta64(1, 'D'), but the output is float rather than int because of the NaN values:

df_test['Difference'] = df_test['Difference'] / np.timedelta64(1, 'D')
print (df_test)
   First_Date Second Date  Difference
0  2016-02-09  2015-11-19        82.0
1  2016-01-06  2015-11-30        37.0
2         NaT  2015-12-04         NaN
3  2016-01-06  2015-12-08        29.0
4         NaT  2015-12-09         NaN
5  2016-01-07  2015-12-11        27.0
6         NaT  2015-12-12         NaN
7         NaT  2015-12-14         NaN
8  2016-01-06  2015-12-14        23.0
9         NaT  2015-12-15         NaN

See "Frequency conversion" in the pandas time deltas documentation.
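The large values from the pd.to_numeric attempt in the question (for example 7.084800e+15 for 82 days) are consistent with timedeltas being stored as nanosecond counts. The sketch below checks that arithmetic with NumPy and contrasts division by np.timedelta64(1, 'D') with .dt.days. The 37.5-day value is a hypothetical addition, not part of the question's data, included only to show that division keeps fractional days:

```python
import numpy as np
import pandas as pd

# 82 days expressed in nanoseconds, the unit behind the to_numeric result.
ns_in_82_days = int(
    np.timedelta64(82, "D").astype("timedelta64[ns]").astype("int64")
)
print(ns_in_82_days)  # 7084800000000000, i.e. 7.0848e+15

# Hypothetical timedeltas: a whole number of days, a fractional one, and NaT.
delta = pd.Series([
    pd.Timedelta(days=82),
    pd.Timedelta(days=37, hours=12),
    pd.NaT,
])

# Division converts to float days and keeps the fractional part.
as_float_days = delta / np.timedelta64(1, "D")

# .dt.days returns only the whole-day component.
whole_days = delta.dt.days

print(as_float_days.tolist())
print(whole_days.tolist())
```

For date-only columns like those in the question, both approaches give the same numbers; they differ only when the timedeltas carry a time-of-day component.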

jezrael answered Sep 21 '22 02:09