How to convert timedelta to time of day in pandas?

I have a SQL table that contains data of the MySQL TIME type, as follows:

time_of_day
-----------
   12:34:56

I then use pandas to read the table in:

df = pd.read_sql('select * from time_of_day', engine)

Looking at df.dtypes yields:

time_of_day timedelta64[ns]

My main issue is that when I write my df to a CSV file, the values come out as full timedelta strings instead of looking like my SQL table:

time_of_day
0 days 12:34:56.000000000

I'd like to instead (obviously) store this record as a time, but I can't find anything in the pandas docs that talks about a time dtype.

Does pandas lack this functionality intentionally? Is there a way to solve my problem without requiring janky data casting?

Seems like this should be elementary, but I'm confounded.

ryantuck asked Dec 28 '15 22:12


2 Answers

Pandas does not support a time dtype series

Pandas (and NumPy) do not have a time dtype. Since you wish to avoid Pandas timedelta, you have 3 options: Pandas datetime, Python datetime.time, or Python str. They are presented below in order of preference. Each example starts from the following dataframe:

df = pd.DataFrame({'time': pd.to_timedelta(['12:34:56', '05:12:45', '15:15:06'])})

print(df['time'].dtype)  # timedelta64[ns]

Pandas datetime series

You can use a Pandas datetime series and include an arbitrary date component, e.g. today's date. Such a series is backed by 64-bit integers, so operations stay vectorised, which makes this solution the most efficient and adaptable.

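A neutral choice of date is the Unix epoch, 1-Jan-1970. Older pandas versions applied it implicitly via pd.to_datetime(df['time']), but recent versions raise a TypeError for timedelta input, so add the date explicitly:

df['time'] = pd.Timestamp('1970-01-01') + df['time']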

print(df)

#                  time
# 0 1970-01-01 12:34:56
# 1 1970-01-01 05:12:45
# 2 1970-01-01 15:15:06

You can also specify a date, such as today:

df['time'] = pd.Timestamp('today').normalize() + df['time']

print(df)

#                  time
# 0 2019-01-02 12:34:56
# 1 2019-01-02 05:12:45
# 2 2019-01-02 15:15:06
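
Since the question is really about the CSV output, here is a minimal sketch of how the datetime route can still write time-only text. It relies on the date_format argument of DataFrame.to_csv; the column name time_of_day and the anchor date are just illustrative choices, not something the question requires.

```python
import pandas as pd

# Illustrative data: a timedelta column like the one read_sql produced.
df = pd.DataFrame({'time_of_day': pd.to_timedelta(['12:34:56', '05:12:45'])})

# Anchor the durations to an arbitrary date to get a datetime64 column.
df['time_of_day'] = pd.Timestamp('1970-01-01') + df['time_of_day']

# date_format applies to datetime columns, so only the time part is written.
csv_text = df.to_csv(index=False, date_format='%H:%M:%S')
print(csv_text)
```

The date component stays in the dataframe for computations and is only dropped at export time.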

Pandas object series of Python datetime.time values

The Python datetime module from the standard library supports datetime.time objects. You can convert your series to an object dtype series whose elements are references to datetime.time objects. Operations will no longer be vectorised, but each value is still stored as numeric fields rather than text.

# pd.to_datetime rejects timedelta input in recent pandas, so add a date first.
df['time'] = (pd.Timestamp('1970-01-01') + df['time']).dt.time

print(df)

#        time
# 0  12:34:56
# 1  05:12:45
# 2  15:15:06

print(df['time'].dtype)
# object

print(type(df['time'].at[0]))
# <class 'datetime.time'>
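
To make the trade-off concrete, the sketch below pulls the hour out of both representations. The variable names are illustrative; the point is only that the datetime64 series can use the vectorised .dt accessor, while the object series of datetime.time values has to go through a Python-level map.

```python
import pandas as pd

td = pd.Series(pd.to_timedelta(['12:34:56', '05:12:45', '15:15:06']))

# datetime64 series: the .dt accessor works on the whole column at once.
as_datetime = pd.Timestamp('1970-01-01') + td
hours_vectorised = as_datetime.dt.hour

# object series of datetime.time: no .dt accessor, so map over each element.
as_time = as_datetime.dt.time
hours_mapped = as_time.map(lambda t: t.hour)

print(hours_vectorised.tolist())
print(hours_mapped.tolist())
```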

Pandas object series of Python str values

Converting to strings is only recommended for presentation purposes that are not supported by other types, e.g. Pandas datetime or Python datetime.time. For example:

# pd.to_datetime rejects timedelta input in recent pandas, so add a date first.
df['time'] = (pd.Timestamp('1970-01-01') + df['time']).dt.strftime('%H:%M:%S')

print(df)

#        time
# 0  12:34:56
# 1  05:12:45
# 2  15:15:06

print(df['time'].dtype)
# object

print(type(df['time'].at[0]))
# <class 'str'>
jpp answered Sep 29 '22 12:09


It's a hack, but you can pull out the components to build a string and then parse that string into a datetime.time object:

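from datetime import datetime

def convert(td):
    # Join the Timedelta's hour, minute and second components as 'H:M:S'.
    c = td.components
    parts = [str(c.hours), str(c.minutes), str(c.seconds)]
    # Parse the string and keep only the time-of-day part.
    return datetime.strptime(':'.join(parts), '%H:%M:%S').time()

df['time'] = df['time'].apply(convert)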
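
A possible simplification, sketched on the assumption that every value is a non-negative duration shorter than one day: build the datetime.time directly from the components and skip the string round trip. The helper name to_time is hypothetical, and sub-second components are dropped here just as in the string version.

```python
from datetime import time

import pandas as pd


def to_time(td):
    """Build a datetime.time from a pandas Timedelta (hypothetical helper).

    Assumes 0 <= td < 1 day; the days component is ignored.
    """
    c = td.components
    return time(c.hours, c.minutes, c.seconds)


df = pd.DataFrame({'time': pd.to_timedelta(['12:34:56', '05:12:45', '15:15:06'])})
df['time'] = df['time'].apply(to_time)
print(df['time'].tolist())
```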
Infamouse answered Sep 29 '22 11:09