I have the following dataset:
  OPEN TIME CLOSE TIME
0  09:44:00   10:07:00
1  10:07:00   11:01:00
2  11:05:00   13:05:00
The timestamps here are stored as strings. How can I convert them to a time format?
Use to_datetime with an explicit format, then take .dt.time:
df['Open'] = pd.to_datetime(df['OPEN TIME'], format='%H:%M:%S').dt.time
df['Close'] = pd.to_datetime(df['CLOSE TIME'], format='%H:%M:%S').dt.time
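For anyone who wants to run the snippet above end to end, here is a minimal sketch. It assumes the frame from the question is rebuilt by hand from the three rows shown; the names `first_open` and `open_dtype` are only there for illustration.

```python
import datetime

import pandas as pd

# Rebuild the sample data from the question (values are strings).
df = pd.DataFrame({
    "OPEN TIME": ["09:44:00", "10:07:00", "11:05:00"],
    "CLOSE TIME": ["10:07:00", "11:01:00", "13:05:00"],
})

# Parse with an explicit format, then keep only the time-of-day part.
df["Open"] = pd.to_datetime(df["OPEN TIME"], format="%H:%M:%S").dt.time
df["Close"] = pd.to_datetime(df["CLOSE TIME"], format="%H:%M:%S").dt.time

# Each cell is now a plain Python datetime.time; the column dtype is object.
first_open = df["Open"].iloc[0]
open_dtype = df["Open"].dtype
print(first_open, open_dtype)
```

Note that the new columns hold Python objects, so pandas reports their dtype as object rather than a dedicated time dtype.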
It's possible to convert both columns in a one-liner using apply. Try:
df = df.assign(**df[['OPEN TIME', 'CLOSE TIME']].apply(pd.to_datetime, format='%H:%M:%S'))
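As a rough check of what that one-liner produces, the sketch below rebuilds the sample frame by hand (an assumption, since the original frame isn't available) and inspects the result. Because the strings carry no date, pandas fills in its default date of 1900-01-01. The name `converted` is just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "OPEN TIME": ["09:44:00", "10:07:00", "11:05:00"],
    "CLOSE TIME": ["10:07:00", "11:01:00", "13:05:00"],
})

# apply() runs pd.to_datetime on each column; assign() returns a new frame.
converted = df.assign(
    **df[["OPEN TIME", "CLOSE TIME"]].apply(pd.to_datetime, format="%H:%M:%S")
)

# Both columns are now datetime64, with 1900-01-01 as the placeholder date.
first_open = converted["OPEN TIME"].iloc[0]
print(converted.dtypes)
print(first_open)
```

The placeholder date is harmless for differences within the same day, but it does show up when the values are printed.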
To get the times without dates, use the following:
# assign back to the columns; this can raise a SettingWithCopyWarning if `df` was filtered from another frame
df[['OPEN TIME', 'CLOSE TIME']] = df[['OPEN TIME', 'CLOSE TIME']].apply(lambda x: pd.to_datetime(x, format='%H:%M:%S').dt.time)
# or call assign, which returns a new dataframe copy and so avoids that warning
df = df.assign(**df[['OPEN TIME', 'CLOSE TIME']].apply(lambda x: pd.to_datetime(x, format='%H:%M:%S').dt.time))
This converts each string into a datetime.time object. However, datetime.time has no corresponding pandas dtype, so the columns are stored with object dtype and vectorized operations are hard to use. For example, you cannot subtract OPEN TIME from CLOSE TIME while they hold datetime.time objects, so there is not much improvement over strings. With datetime64 columns the subtraction works. For example, the following creates datetime64 columns and computes the difference:
df1 = df.assign(**df[['OPEN TIME', 'CLOSE TIME']].apply(pd.to_datetime, format='%H:%M:%S'))
df1['CLOSE TIME'] - df1['OPEN TIME']
0 0 days 00:23:00
1 0 days 00:54:00
2 0 days 02:00:00
dtype: timedelta64[ns]
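To make the contrast concrete, here is a small sketch under the same assumption as before (the frame is rebuilt from the rows in the question). It shows that two datetime.time values cannot be subtracted, while the datetime64 columns can, and it converts the resulting durations to minutes. The names `times`, `stamps`, and `minutes` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "OPEN TIME": ["09:44:00", "10:07:00", "11:05:00"],
    "CLOSE TIME": ["10:07:00", "11:01:00", "13:05:00"],
})

# Variant 1: datetime.time objects (object dtype).
times = df.apply(lambda col: pd.to_datetime(col, format="%H:%M:%S").dt.time)
try:
    # Python's datetime.time does not define subtraction.
    times["CLOSE TIME"].iloc[0] - times["OPEN TIME"].iloc[0]
    subtraction_failed = False
except TypeError:
    subtraction_failed = True

# Variant 2: datetime64 columns support vectorized subtraction.
stamps = df.apply(pd.to_datetime, format="%H:%M:%S")
durations = stamps["CLOSE TIME"] - stamps["OPEN TIME"]

# Timedelta columns expose .dt.total_seconds() for numeric work.
minutes = durations.dt.total_seconds() / 60
print(subtraction_failed)
print(minutes.tolist())
```

If only the durations matter, keeping the datetime64 columns and formatting the time of day at display time is probably the more convenient choice.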