In a similar vein to this question, I have a numpy.timedelta64 column in a pandas DataFrame. As per this answer to the aforementioned question, there is a function pandas.tslib.repr_timedelta64 which nicely displays a timedelta in days, hours:minutes:seconds. I would like to format them only in days and hours.
So what I've got is the following:
def silly_format(hours):
    (days, hours) = divmod(hours, 24)
    if days > 0 and hours > 0:
        str_time = "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        str_time = "{0:.0f} d".format(days)
    else:
        str_time = "{0:.0f} h".format(hours)
    return str_time
df["time"].astype("timedelta64[h]").map(silly_format)
which gets me the desired output, but I was wondering whether there is a function in numpy or pandas, similar to datetime.strftime, that can format numpy.timedelta64 according to some format string provided?
I tried to adapt @Jeff's solution further but it is way slower than my answer. Here it is:
days = time_delta.astype("timedelta64[D]").astype(int)
hours = time_delta.astype("timedelta64[h]").astype(int) % 24
result = days.astype(str)
mask = (days > 0) & (hours > 0)
result[mask] = days.astype(str) + ' d, ' + hours.astype(str) + ' h'
result[(hours > 0) & ~mask] = hours.astype(str) + ' h'
result[(days > 0) & ~mask] = days.astype(str) + ' d'
While the answers provided by @sebix and @Jeff show a nice way of converting the timedeltas to days and hours, and @Jeff's solution in particular retains the Series' index, they lack flexibility in the final formatting of the string. The solution I'm using now is:
def delta_format(days, hours):
    if days > 0 and hours > 0:
        return "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        return "{0:.0f} d".format(days)
    else:
        return "{0:.0f} h".format(hours)

days = time_delta.astype("timedelta64[D]")
hours = time_delta.astype("timedelta64[h]") % 24
# Built-in zip on Python 3; on Python 2 use itertools.izip to stay lazy.
formatted = [delta_format(d, h) for (d, h) in zip(days, hours)]
which suits me well, and I get the index back by inserting that list into the original DataFrame.
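As a minimal sketch of that last step, the list can be assigned back as a new column so that each string lines up with its row's index label. The sample frame, the column name time_str, and the total-hours computation via dt.total_seconds() are assumptions made for illustration and are not part of the code above.

```python
import pandas as pd


def delta_format(days, hours):
    """Format a day count and an hour remainder as 'D d, H h'."""
    if days > 0 and hours > 0:
        return "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        return "{0:.0f} d".format(days)
    else:
        return "{0:.0f} h".format(hours)


# Hypothetical sample data with a non-default index.
df = pd.DataFrame({"time": pd.to_timedelta([3, 27, 48], unit="h")})
df.index = ["a", "b", "c"]

# Whole hours as floats, then split into days and the remaining hours.
total_hours = df["time"].dt.total_seconds() // 3600
days = total_hours // 24
hours = total_hours % 24

# Assigning a plain list is positional, so row order (and thus the index) is kept.
df["time_str"] = [delta_format(d, h) for d, h in zip(days, hours)]
```

Because the list is built by iterating over the column in row order, no explicit index handling is needed when it is written back.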
Here's how to do it in a vectorized manner.
In [28]: s = pd.to_timedelta(range(5),unit='d') + pd.offsets.Hour(3)
In [29]: s
Out[29]:
0 0 days, 03:00:00
1 1 days, 03:00:00
2 2 days, 03:00:00
3 3 days, 03:00:00
4 4 days, 03:00:00
dtype: timedelta64[ns]
In [30]: days = s.astype('timedelta64[D]').astype(int)
In [31]: hours = s.astype('timedelta64[h]').astype(int)-days*24
In [32]: days
Out[32]:
0 0
1 1
2 2
3 3
4 4
dtype: int64
In [33]: hours
Out[33]:
0 3
1 3
2 3
3 3
4 3
dtype: int64
In [34]: days.astype(str) + ' d, ' + hours.astype(str) + ' h'
Out[34]:
0 0 d, 3 h
1 1 d, 3 h
2 2 d, 3 h
3 3 d, 3 h
4 4 d, 3 h
dtype: object
If you want exactly the output the OP asked for:
In [4]: result = days.astype(str) + ' d, ' + hours.astype(str) + ' h'
In [5]: result[days==0] = hours.astype(str) + ' h'
In [6]: result
Out[6]:
0 3 h
1 1 d, 3 h
2 2 d, 3 h
3 3 d, 3 h
4 4 d, 3 h
dtype: object
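On pandas versions that provide the Series.dt.components accessor, the day and hour parts can be read off directly instead of being derived through astype. The following is only a sketch under that assumption: format_days_hours is a hypothetical helper name, it assumes non-negative timedeltas, and it also drops the hour part when it is zero, as the function in the question does.

```python
import pandas as pd


def format_days_hours(s):
    """Return 'D d, H h' strings for a timedelta64[ns] Series (hypothetical helper)."""
    comp = s.dt.components  # DataFrame with days, hours, minutes, ... columns
    days = comp["days"].astype(str)
    hours = comp["hours"].astype(str)

    both = days + " d, " + hours + " h"
    only_days = days + " d"
    only_hours = hours + " h"

    # Series.where keeps values where the condition holds and takes `other` elsewhere.
    out = both.where(comp["hours"] != 0, only_days)
    out = out.where(comp["days"] != 0, only_hours)
    return out


s = pd.Series(pd.to_timedelta([0, 1, 2, 3, 4], unit="D") + pd.Timedelta(hours=3))
formatted = format_days_hours(s)
```

The result is a Series that shares the index of s, so it can be assigned straight into a DataFrame column.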