I've got two large df's from two xlsx spreadsheets and would like to merge them 'on' time stamp ['Time'] data.
The problem is that one data set has recorded time to decimal seconds and the other to whole seconds. This means that the two data set ['Time'] keys never match up...
df1 Time df2 Time
0 00:07:53.755000 0 00:07:53
1 00:07:54.096000 1 00:07:54
2 00:07:55.097000 2 00:07:55
3 00:07:56.099000 3 00:07:56
4 00:07:57.002000 4 00:07:57
5 00:07:58.012000 5 00:07:58
I've tried modifying time formats in Excel but it always retains the millisecond value because of its fractional 24 hour thing. I need to remove the decimal seconds from df1 (or just get them to match somehow!) to allow matching with row data from df2 and am hoping there is a much simpler way to do this in python?
Thanks to any and all advice!
In df1
, you can just set microseconds to 0:
df1['Time'] = pd.to_datetime(df1['Time']).apply(lambda x: x.replace(microsecond=0))
Then perform your merge as normal.
I would use pandas strftime
df1['Time'] = pd.to_datetime(df1['Time']).dt.strftime('%H:%M:%S')
df2['Time'] = pd.to_datetime(df2['Time']).dt.strftime('%H:%M:%S')
merged = pd.merge(df1, df2, on='Time')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With