
Convert integer series to timedelta in pandas

Tags:

python

pandas

I have a data frame in pandas that includes the number of days since an event occurred. I want to create a new column that calculates the date of the event by subtracting the number of days from the current date. Every time I attempt to apply pd.offsets.Day or pd.Timedelta I get an error stating that Series are an unsupported type. This also occurs when I use apply. When I use map I receive a runtime error saying "maximum recursion depth exceeded while calling a Python object".

For example, assume my data frame looked like this:

index    days_since_event
0        5
1        7
2        3
3        6
4        0

I want to create a new column with the date of the event, so my expected outcome (using today's date of 12/29/2015) is:

index    days_since_event    event_date
0        5                   2015-12-24
1        7                   2015-12-22
2        3                   2015-12-26
3        6                   2015-12-23
4        0                   2015-12-29

I have attempted multiple ways to do this, but have received errors for each.

One method I tried was:

now = pd.datetime.date(pd.datetime.now())
df['event_date'] = now - df.days_since_event.apply(pd.offsets.Day)

With this I received an error saying that Series are an unsupported type.

I tried the above with .map instead of .apply, and received the error that "maximum recursion depth exceeded while calling a Python object".

I also attempted to convert the days into timedelta, such as:

df.days_since_event = (dt.timedelta(days = df.days_since_event)).apply

This also received an error referencing the series being an unsupported type.

abdouglass asked Dec 29 '15 21:12



2 Answers

First, to convert the column with integers to a timedelta, you can use to_timedelta:

In [60]: pd.to_timedelta(df['days_since_event'], unit='D')
Out[60]:
0   5 days
1   7 days
2   3 days
3   6 days
4   0 days
Name: days_since_event, dtype: timedelta64[ns]

Then you can create a new column with the current date and subtract those timedeltas:

In [62]: df['event_date'] = pd.Timestamp('2015-12-29')

In [63]: df['event_date'] = df['event_date'] -  pd.to_timedelta(df['days_since_event'], unit='D')

In [64]: df['event_date']
Out[64]:
0   2015-12-24
1   2015-12-22
2   2015-12-26
3   2015-12-23
4   2015-12-29
dtype: datetime64[ns]
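
For reference, the two steps above can also be written as one self-contained script. This is only a minimal sketch: the reference date is fixed at 2015-12-29 (the date used in the question) so the result is reproducible, and the variable name `today` is my own choice. A scalar Timestamp minus a timedelta Series is applied element-wise, so the intermediate column of repeated dates is not strictly needed.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'days_since_event': [5, 7, 3, 6, 0]})

# Fixed reference date taken from the question. To use the real current
# date instead, pd.Timestamp.today().normalize() should work (assumption
# about what is wanted: midnight of the current day).
today = pd.Timestamp('2015-12-29')

# Convert the integer column to timedeltas and subtract from the scalar date
df['event_date'] = today - pd.to_timedelta(df['days_since_event'], unit='D')

print(df)
```

No apply or map is involved here; the conversion and the subtraction both operate on the whole column at once.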

joris answered Oct 04 '22 04:10


Just to follow up with joris' response, you can convert an int or a float into whatever time unit you want with pd.to_timedelta(x, unit=''), changing only the entry for unit=:

# Years, Months, Days:
# ('Y' and 'M' are ambiguous durations; recent pandas releases
# reject them with a ValueError, so prefer 'D' or smaller units)
pd.to_timedelta(3.5, unit='Y') # returned '1095 days 17:27:36' on older pandas
pd.to_timedelta(3.5, unit='M') # returned '91 days 07:27:18' on older pandas
pd.to_timedelta(3.5, unit='D') # returns '3 days 12:00:00'

# Hours, Minutes, Seconds:
pd.to_timedelta(3.5, unit='h') # returns '0 days 03:30:00'
pd.to_timedelta(3.5, unit='m') # returns '0 days 00:03:30'
pd.to_timedelta(3.5, unit='s') # returns '0 days 00:00:03.500000'

Note that arithmetic works once the values have been converted to Timedelta objects:

pd.to_timedelta(3.5, unit='h') - pd.to_timedelta(3.25, unit='h') # returns '0 days 00:15:00'
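
The same arithmetic also works element-wise on a whole Series, which is the case the original question is about. A small sketch, with made-up values (the `hours` Series and the other variable names are hypothetical, for illustration only):

```python
import pandas as pd

# Hypothetical durations in hours, for illustration only
hours = pd.Series([3.5, 3.25, 1.0])

# Convert the whole Series in one call; no apply/map needed
deltas = pd.to_timedelta(hours, unit='h')

# Subtract a scalar Timedelta from every element
diff = deltas - pd.Timedelta(hours=1)

# Dividing by a Timedelta converts back to plain floats (here: minutes)
minutes = diff / pd.Timedelta(minutes=1)
print(minutes.tolist())
```

Dividing by a Timedelta is a convenient way to get back to plain numbers in whatever unit you need.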

Gabriel123 answered Oct 04 '22 04:10