I have a pandas DataFrame like this (obtained by parsing an Excel file):
|       | COMPANY NAME            | MEETING DATE        | MEETING TIME |
|-------|-------------------------|---------------------|--------------|
| YKSGR | YAPI KREDİ SİGORTA A.Ş. | 2013-12-16 00:00:00 | 14:00:00     |
| TRCAS | TURCAS PETROL A.Ş.      | 2013-12-12 00:00:00 | 13:30:00     |
Column MEETING DATE is a timestamp with a representation like Timestamp('2013-12-20 00:00:00', tz=None), and MEETING TIME is a datetime.time object with a representation like datetime.time(14, 0).
I want to combine MEETING DATE and MEETING TIME into one column. datetime.combine seems to do what I want, but I need to apply it to every row of the two columns somehow. How can I achieve this?
You can use the apply method and call datetime.combine on each row, like this:
>>> from datetime import datetime
>>> df.apply(lambda x: datetime.combine(x['MEETING DATE'], x['MEETING TIME']), axis=1)
0 2013-12-16 14:00:00
1 2013-12-12 13:30:00
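For anyone who wants to try this end to end, here is a minimal self-contained sketch. The frame is a hypothetical reconstruction of the one in the question (values copied from the table), and the new column name MEETING DATETIME is my own choice, not something from the question.

```python
from datetime import datetime, time

import pandas as pd

# Hypothetical reconstruction of the question's frame; values copied from its table.
df = pd.DataFrame(
    {
        "COMPANY NAME": ["YAPI KREDİ SİGORTA A.Ş.", "TURCAS PETROL A.Ş."],
        "MEETING DATE": pd.to_datetime(["2013-12-16", "2013-12-12"]),
        "MEETING TIME": [time(14, 0), time(13, 30)],
    },
    index=["YKSGR", "TRCAS"],
)

# datetime.combine accepts a pandas Timestamp as its first argument because
# Timestamp subclasses datetime.datetime; only the date part is used.
df["MEETING DATETIME"] = df.apply(
    lambda row: datetime.combine(row["MEETING DATE"], row["MEETING TIME"]),
    axis=1,
)

print(df[["MEETING DATE", "MEETING TIME", "MEETING DATETIME"]])
```

Note that axis=1 makes apply call the lambda once per row, which is simple but not vectorized, so it can get slow on large frames.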
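Other solutions didn't work for me, so I came up with a workaround using replace instead of combine:
def combine_date_time(df, datecol, timecol):
    # Overwrite the midnight time of each date with the parts of the matching time value.
    return df.apply(
        lambda row: row[datecol].replace(
            hour=row[timecol].hour,
            minute=row[timecol].minute,
            second=row[timecol].second,  # without this, seconds would be dropped
        ),
        axis=1,
    )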
In your case:
combine_date_time(df, 'MEETING DATE', 'MEETING TIME')
It feels slow (I haven't timed it properly), but it works.
UPDATE: I have timed both approaches on a relatively large dataset (>500,000 rows). Their run times are similar, but combine is faster (59 s for replace vs. 50 s for combine). Also, see jezrael's answer on this.
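If you want to check the timing on your own machine, something like the sketch below should do. The frame size, the helper names with_combine and with_replace, and the repeat count are all my assumptions for illustration. It does not reproduce the dataset I measured on, so expect different absolute numbers.

```python
import timeit
from datetime import datetime, time

import pandas as pd

n = 5_000  # hypothetical size, kept small so the sketch runs quickly
df = pd.DataFrame(
    {
        "MEETING DATE": pd.to_datetime(["2013-12-16"] * n),
        "MEETING TIME": [time(14, 0)] * n,
    }
)


def with_combine(frame):
    # Row-wise datetime.combine, as in the apply answer.
    return frame.apply(
        lambda row: datetime.combine(row["MEETING DATE"], row["MEETING TIME"]),
        axis=1,
    )


def with_replace(frame):
    # Row-wise Timestamp.replace, as in the workaround.
    return frame.apply(
        lambda row: row["MEETING DATE"].replace(
            hour=row["MEETING TIME"].hour,
            minute=row["MEETING TIME"].minute,
        ),
        axis=1,
    )


runs = 3
for fn in (with_combine, with_replace):
    seconds = timeit.timeit(lambda: fn(df), number=runs)
    print(f"{fn.__name__}: {seconds / runs:.3f} s per run")
```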
UPDATE2: I have tried jezrael's approach:
def combine_date_time(df, datecol, timecol):
    return pd.to_datetime(df[datecol].dt.date.astype(str)
                          + ' '
                          + df[timecol].astype(str))
This approach is far faster in comparison; jezrael is right. I haven't been able to measure it properly, but the difference is evident.
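Here is a small runnable sketch of the string-based version on the two rows from the question. The sample frame is a hypothetical reconstruction, and the intermediate names date_text and time_text are only there for readability.

```python
from datetime import time

import pandas as pd

# Hypothetical reconstruction of the question's frame.
df = pd.DataFrame(
    {
        "MEETING DATE": pd.to_datetime(["2013-12-16", "2013-12-12"]),
        "MEETING TIME": [time(14, 0), time(13, 30)],
    },
    index=["YKSGR", "TRCAS"],
)


def combine_date_time(df, datecol, timecol):
    date_text = df[datecol].dt.date.astype(str)  # e.g. "2013-12-16"
    time_text = df[timecol].astype(str)          # e.g. "14:00:00"
    # One call parses the whole column, instead of one Python call per row.
    return pd.to_datetime(date_text + " " + time_text)


combined = combine_date_time(df, "MEETING DATE", "MEETING TIME")
print(combined)
```

The speedup presumably comes from avoiding a Python-level function call per row, although the round trip through strings has a cost of its own.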