I have two date columns in a pandas DataFrame.
I want to subtract one column from the other and get the difference as an integer number of days.
A peek at the data:
    df_test.head(10)
    Out[20]:
      First_Date Second Date
    0 2016-02-09  2015-11-19
    1 2016-01-06  2015-11-30
    2        NaT  2015-12-04
    3 2016-01-06  2015-12-08
    4        NaT  2015-12-09
    5 2016-01-07  2015-12-11
    6        NaT  2015-12-12
    7        NaT  2015-12-14
    8 2016-01-06  2015-12-14
    9        NaT  2015-12-15
I have created a new column successfully with the difference:
    df_test['Difference'] = df_test['First_Date'].sub(df_test['Second Date'], axis=0)
    df_test.head()
    Out[22]:
      First_Date Second Date  Difference
    0 2016-02-09  2015-11-19     82 days
    1 2016-01-06  2015-11-30     37 days
    2        NaT  2015-12-04         NaT
    3 2016-01-06  2015-12-08     29 days
    4        NaT  2015-12-09         NaT
However, I am unable to get a numeric version of the result:
    df_test['Difference'] = df_test[['Difference']].apply(pd.to_numeric)
    df_test.head()
    Out[25]:
      First_Date Second Date    Difference
    0 2016-02-09  2015-11-19  7.084800e+15
    1 2016-01-06  2015-11-30  3.196800e+15
    2        NaT  2015-12-04           NaN
    3 2016-01-06  2015-12-08  2.505600e+15
    4        NaT  2015-12-09           NaN
How about:
df_test['Difference'] = (df_test['First_Date'] - df_test['Second Date']).dt.days
This returns the difference as int if there are no missing values (NaT) and as float if there are.
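The behaviour described above can be sketched on the first five rows of the question's data. This is a minimal, self-contained example, not the asker's actual code: the frame is rebuilt by hand, and the final cast to pandas' nullable `Int64` dtype is an addition that the answer does not mention. It is one possible way to keep whole numbers while still allowing missing values.

```python
import pandas as pd

# Rebuild the first five rows from the question (NaT marks a missing date).
df_test = pd.DataFrame({
    "First_Date": pd.to_datetime(
        ["2016-02-09", "2016-01-06", None, "2016-01-06", None]
    ),
    "Second Date": pd.to_datetime(
        ["2015-11-19", "2015-11-30", "2015-12-04", "2015-12-08", "2015-12-09"]
    ),
})

# Subtracting two datetime columns gives a timedelta column;
# .dt.days extracts the day component of each timedelta.
days_float = (df_test["First_Date"] - df_test["Second Date"]).dt.days

# Because of the NaT rows the result holds NaN and is therefore float.
# Assumption: a nullable integer column is acceptable for your use case.
days_nullable = days_float.astype("Int64")

print(days_float)
print(days_nullable)
```

If the missing rows are not needed, dropping them first (for example with `dropna`) and then casting with `astype(int)` is another option.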
pandas has rich documentation on Time series / date functionality and Time deltas.
You can divide a column of dtype timedelta by np.timedelta64(1, 'D'), but the output is float rather than int because of the NaN values:
    df_test['Difference'] = df_test['Difference'] / np.timedelta64(1, 'D')
    print (df_test)
      First_Date Second Date  Difference
    0 2016-02-09  2015-11-19        82.0
    1 2016-01-06  2015-11-30        37.0
    2        NaT  2015-12-04         NaN
    3 2016-01-06  2015-12-08        29.0
    4        NaT  2015-12-09         NaN
    5 2016-01-07  2015-12-11        27.0
    6        NaT  2015-12-12         NaN
    7        NaT  2015-12-14         NaN
    8 2016-01-06  2015-12-14        23.0
    9        NaT  2015-12-15         NaN
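As a rough illustration of how the division differs from `.dt.days`, the sketch below uses a small hand-made timedelta series (the values are illustrative and not taken from the question). Division keeps fractional days, whereas `.dt.days` returns only the day component. The last lines check the arithmetic behind the large numbers in the question: 82 days expressed in nanoseconds is 7.0848e15, which suggests that `pd.to_numeric` returned the underlying nanosecond counts.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: a whole number of days, a day and a half, and a missing value.
delta = pd.Series(pd.to_timedelta(["82 days", "1 days 12:00:00", pd.NaT]))

# Dividing by a one-day timedelta gives float days (NaN where the input is NaT).
as_days = delta / np.timedelta64(1, "D")

# .dt.days keeps only the day component, so 1 day 12 hours becomes 1.
whole_days = delta.dt.days

# Nanoseconds in one day, for comparison with the values in the question.
ns_per_day = 24 * 60 * 60 * 10**9
ns_in_82_days = 82 * ns_per_day

print(as_days)
print(whole_days)
print(ns_in_82_days)
```

For date-only columns like the ones in the question both approaches give the same numbers. They only differ once the timestamps carry a time of day.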
See the "Frequency conversion" section of the pandas Time deltas documentation.