
Conversions of np.timedelta64 to days, weeks, months, etc

When I compute the difference between two pandas datetime64 dates I get np.timedelta64. Is there any easy way to convert these deltas into representations like hours, days, weeks, etc.?

I could not find any methods on np.timedelta64 that convert between units, but pandas seems to know how to express these deltas in days when printing them (e.g. I get 29 days, 23:20:00 in the string representation of DataFrames). Is there any way to access this functionality?

Update:

Strangely, neither of the following works:

> df['column_with_times'].days
> df['column_with_times'].apply(lambda x: x.days)

but this one does:

df['column_with_times'][0].days
Amelio Vazquez-Reina asked Sep 04 '14 18:09


1 Answer

pandas stores timedelta data in the numpy timedelta64[ns] dtype, but also provides the Timedelta type, which wraps these values for convenience (e.g. it exposes attributes such as days and seconds, plus a components breakdown that includes hours and minutes).

In [41]: timedelta_col = pd.Series(pd.timedelta_range('1 days', periods=5, freq='2 h'))

In [42]: timedelta_col
Out[42]:
0   1 days 00:00:00
1   1 days 02:00:00
2   1 days 04:00:00
3   1 days 06:00:00
4   1 days 08:00:00
dtype: timedelta64[ns]

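To access the different components of a full column (Series), you have to use the .dt accessor. In recent pandas versions the hours are reached through .dt.components (some early releases also accepted timedelta_col.dt.hours). For example:

In [43]: timedelta_col.dt.components.hours
Out[43]:
0    0
1    2
2    4
3    6
4    8
Name: hours, dtype: int64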

With timedelta_col.dt.components you get a DataFrame with all the different components (days to nanoseconds) as separate columns.
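As a rough illustration of the accessor, here is a minimal sketch. It assumes a recent pandas release, where the hour field is reached through .dt.components while days is available directly on .dt; the variable names are only for illustration.

```python
import pandas as pd

# Same kind of column as in the answer: five timedeltas, two hours apart.
timedelta_col = pd.Series(pd.timedelta_range("1 days", periods=5, freq="2h"))

# DataFrame with one column per component (days, hours, minutes, ...).
components = timedelta_col.dt.components
component_names = list(components.columns)

# Individual fields for the whole column.
hours = components["hours"]
days = timedelta_col.dt.days

print(component_names)
print(hours.tolist())
print(days.tolist())
```

Note that these are fields of the breakdown, not conversions: hours here is the hour part of each value (0 to 23), not the total duration expressed in hours.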
Accessing a single value of the column above returns a Timedelta, and on that scalar you don't need the .dt accessor; you can read the components directly:

In [45]: timedelta_col[0]
Out[45]: Timedelta('1 days 00:00:00')

In [46]: timedelta_col[0].days
Out[46]: 1
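For the scalar case, a small sketch of what a single Timedelta exposes may help. The sample value is made up for illustration, and the attribute layout assumes a recent pandas release, where days and seconds mirror datetime.timedelta and the hour and minute fields live under components.

```python
import pandas as pd

# Hypothetical sample value, not taken from the question's data.
td = pd.Timedelta("1 days 02:30:00")

whole_days = td.days                 # whole days only
leftover_seconds = td.seconds        # seconds remaining after removing whole days
hour_field = td.components.hours     # hour part of the breakdown
minute_field = td.components.minutes # minute part of the breakdown
total = td.total_seconds()           # full duration as a float number of seconds

print(whole_days, leftover_seconds, hour_field, minute_field, total)
```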

So the .dt accessor provides access to the attributes of the Timedelta scalar, but for the full column. That is why df['column_with_times'][0].days works but df['column_with_times'].days does not.
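The reason df['column_with_times'].apply(lambda x: x.days) did not work is that, in the pandas version used at the time, apply was given the raw timedelta64 values (not the pandas Timedelta type), and these don't have such attributes. In recent pandas versions apply passes Timedelta objects, so that call works as well.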
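For the original goal of expressing deltas as plain numbers of hours, days or weeks, one common approach is to divide the column by a unit-sized timedelta. The following is a minimal sketch with made-up sample values, not the asker's data. Months are left out because a calendar month has no fixed length.

```python
import numpy as np
import pandas as pd

# Hypothetical sample deltas for illustration.
deltas = pd.Series(pd.to_timedelta(["29 days 23:20:00", "1 days 12:00:00"]))

# Dividing by a unit-sized timedelta yields float64 values in that unit.
hours = deltas / pd.Timedelta(hours=1)
days = deltas / np.timedelta64(1, "D")
weeks = deltas / pd.Timedelta(weeks=1)

# Total duration in seconds, as floats.
seconds = deltas.dt.total_seconds()

print(hours.tolist())
print(days.tolist())
print(weeks.tolist())
print(seconds.tolist())
```

Unlike .dt.days, which truncates to whole days, the division keeps the fractional part, so 1 days 12:00:00 comes out as 1.5 days.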

joris answered Oct 28 '22 10:10