I have a dataframe with 3 columns, one for hour, one for minute, and one for second, like this:
import pandas as pd

df = pd.DataFrame({'hour': [9.0, 9.0, 9.0, 10.0],
                   'min': [12.0, 13.0, 55.0, 2.0],
                   'sec': [42.0, 30.0, 12.0, 5.0]})
>>> df
   hour   min   sec
0   9.0  12.0  42.0
1   9.0  13.0  30.0
2   9.0  55.0  12.0
3  10.0   2.0   5.0
I'm trying to combine the three columns into a new column made up of a datetime series. The goal would be to have this dataframe:
   hour   min   sec      time
0   9.0  12.0  42.0   9:12:42
1   9.0  13.0  30.0   9:13:30
2   9.0  55.0  12.0   9:55:12
3  10.0   2.0   5.0  10:02:05
So far I'm trying to use pd.to_datetime, as such:
df['time'] = pd.to_datetime(df[['hour', 'min', 'sec']],
                            format='%H:%M:%S')
But I'm getting the following ValueError:
ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing.
I was trying to avoid this by including the format argument with only hour, minute, and second, but apparently that doesn't work.
A similar question was asked here, but the solutions proposed there do not seem to work in this case: I'm still getting this ValueError.
Any ideas to solve this would be appreciated!
Thanks!
[EDIT]: I also needed the method to be able to deal with NaNs, so a dataframe such as this:
import numpy as np

df = pd.DataFrame({'hour': [9.0, 9.0, 9.0, 10.0, np.nan],
                   'min': [12.0, 13.0, 55.0, 2.0, np.nan],
                   'sec': [42.0, 30.0, 12.0, 5.0, np.nan]})
The solution proposed by @PiRSquared works.
I'm not sure if there is a more direct way, but this works:
df['time'] = pd.to_datetime(
    df['hour'].astype(int).astype(str) + ':'
    + df['min'].astype(int).astype(str) + ':'
    + df['sec'].astype(int).astype(str),
    format='%H:%M:%S'
).dt.time
   hour   min   sec      time
0   9.0  12.0  42.0  09:12:42
1   9.0  13.0  30.0  09:13:30
2   9.0  55.0  12.0  09:55:12
3  10.0   2.0   5.0  10:02:05
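One caveat: astype(int) raises on NaN, so the string-building approach above does not cover the NaN rows from the question's edit. Below is a minimal sketch of an alternative that is not part of the original answer. It assumes the 'hour', 'min', and 'sec' columns from the question, converts each one with pd.to_timedelta, and adds the sum to an arbitrary placeholder date (2017-01-01, chosen only for illustration) before taking .dt.time. Missing values should come through as NaT instead of raising.

```python
import numpy as np
import pandas as pd

# Five-row frame from the question's edit, including an all-NaN row.
df = pd.DataFrame({'hour': [9.0, 9.0, 9.0, 10.0, np.nan],
                   'min': [12.0, 13.0, 55.0, 2.0, np.nan],
                   'sec': [42.0, 30.0, 12.0, 5.0, np.nan]})

# Convert each column to a Timedelta and sum them; NaN becomes NaT.
elapsed = (pd.to_timedelta(df['hour'], unit='h')
           + pd.to_timedelta(df['min'], unit='min')
           + pd.to_timedelta(df['sec'], unit='s'))

# Anchor on an arbitrary date so .dt.time can extract datetime.time objects.
anchor = pd.Timestamp('2017-01-01')  # placeholder date, discarded below
df['time'] = (anchor + elapsed).dt.time
```

If a duration is enough, the intermediate elapsed series can be stored directly as a timedelta column, which avoids the placeholder date altogether.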
We can use pd.to_datetime on a dataframe with the requisite column names to create a series of datetimes.
However, the OP's initial dataframe has a 'min' column that needs to be renamed 'minute' and a 'sec' column that needs to be renamed 'second'.
In addition, I'll add the missing columns 'year', 'month', and 'day' using pd.DataFrame.assign. Any valid date will do, since only the time part is kept.
Finally, I'll add the 'time' column with pd.DataFrame.assign again.
new = dict(year=2017, month=1, day=1)
rnm = dict(min='minute', sec='second')
df.assign(
    time=pd.to_datetime(
        df.rename(columns=rnm).assign(**new)
    ).dt.time
)
   hour   min   sec      time
0   9.0  12.0  42.0  09:12:42
1   9.0  13.0  30.0  09:13:30
2   9.0  55.0  12.0  09:55:12
3  10.0   2.0   5.0  10:02:05
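For the NaN case from the question's edit, here is a minimal sketch that applies the same rename-and-assign steps to the five-row dataframe. The names result and valid are only illustrative, and 2017-01-01 is still an arbitrary placeholder date. Rows with a missing component should come out as NaT rather than raising.

```python
import numpy as np
import pandas as pd

# Five-row frame from the question's edit, including an all-NaN row.
df = pd.DataFrame({'hour': [9.0, 9.0, 9.0, 10.0, np.nan],
                   'min': [12.0, 13.0, 55.0, 2.0, np.nan],
                   'sec': [42.0, 30.0, 12.0, 5.0, np.nan]})

new = dict(year=2017, month=1, day=1)    # arbitrary placeholder date
rnm = dict(min='minute', sec='second')   # column names pd.to_datetime expects

# Assemble datetimes from the unit columns, then keep only the time part.
result = df.assign(
    time=pd.to_datetime(df.rename(columns=rnm).assign(**new)).dt.time
)

# Rows with a missing component hold NaT, so they are easy to filter out.
valid = result[result['time'].notna()]
```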