Combine date column and time column into datetime column

Tags: python, datetime, pandas, data-analysis

I have a Pandas dataframe like this (obtained by parsing an Excel file):

|       | COMPANY NAME            | MEETING DATE        | MEETING TIME |
|-------|-------------------------|---------------------|--------------|
| YKSGR | YAPI KREDİ SİGORTA A.Ş. | 2013-12-16 00:00:00 | 14:00:00     |
| TRCAS | TURCAS PETROL A.Ş.      | 2013-12-12 00:00:00 | 13:30:00     |

The MEETING DATE column holds timestamps with a representation like Timestamp('2013-12-20 00:00:00', tz=None), and MEETING TIME holds datetime.time objects with a representation like datetime.time(14, 0).

I want to combine MEETING DATE and MEETING TIME into one column. datetime.combine seems to do what I want, but I need to apply it to every row of the two columns somehow. How can I achieve this?

316

asked Nov 15 '13 19:11

yasar

2 Answers

You can use the apply method with datetime.combine, applied row by row (axis=1), like this:

>>> from datetime import datetime
>>> df.apply(lambda x: datetime.combine(x['MEETING DATE'], x['MEETING TIME']), axis=1)
0   2013-12-16 14:00:00
1   2013-12-12 13:30:00
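
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the two rows in the question, and the MEETING DATETIME column name is hypothetical; it is only used here to store the result.

```python
from datetime import datetime, time

import pandas as pd

# Sample data rebuilt by hand from the two rows shown in the question.
df = pd.DataFrame(
    {
        "COMPANY NAME": ["YAPI KREDİ SİGORTA A.Ş.", "TURCAS PETROL A.Ş."],
        "MEETING DATE": pd.to_datetime(["2013-12-16", "2013-12-12"]),
        "MEETING TIME": [time(14, 0), time(13, 30)],
    },
    index=["YKSGR", "TRCAS"],
)

# pandas.Timestamp subclasses datetime.datetime, so it can be passed as the
# date argument of datetime.combine; its own time part is ignored.
# "MEETING DATETIME" is a hypothetical name for the new column.
df["MEETING DATETIME"] = df.apply(
    lambda row: datetime.combine(row["MEETING DATE"], row["MEETING TIME"]),
    axis=1,
)

print(df["MEETING DATETIME"])
```

Because apply with axis=1 calls the lambda once per row, this is easy to read but not vectorized, so it can be slow on large frames.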

answered Sep 23 '22 18:09

Roman Pekar

Other solutions didn't work for me, so I came up with a workaround using replace instead of combine:

def combine_date_time(df, datecol, timecol):
    # Copy the time-of-day fields from timecol onto each row's date.
    return df.apply(
        lambda row: row[datecol].replace(
            hour=row[timecol].hour,
            minute=row[timecol].minute,
            second=row[timecol].second,
        ),
        axis=1,
    )

In your case:

combine_date_time(df, 'MEETING DATE', 'MEETING TIME')

It feels slow (I haven't timed it properly), but it works.

UPDATE: I have timed both approaches on a relatively large dataset (more than 500,000 rows). Their run times are similar, but combine is somewhat faster (59 s for replace vs 50 s for combine). Also, see jezrael's answer on this.
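
A comparison like this could be reproduced with a sketch along the following lines. The data is synthetic and far smaller than the dataset mentioned above, so the absolute numbers will differ; the helper names with_combine and with_replace are hypothetical.

```python
import timeit
from datetime import datetime, time

import pandas as pd

# Synthetic data (an assumption for illustration): 2,000 rows.
n = 2_000
df = pd.DataFrame(
    {
        "MEETING DATE": pd.date_range("2013-01-01", periods=n, freq="D"),
        "MEETING TIME": [time(i % 24, i % 60) for i in range(n)],
    }
)


def with_combine(frame):
    # Row-wise datetime.combine.
    return frame.apply(
        lambda row: datetime.combine(row["MEETING DATE"], row["MEETING TIME"]),
        axis=1,
    )


def with_replace(frame):
    # Row-wise Timestamp.replace, overwriting hour and minute.
    return frame.apply(
        lambda row: row["MEETING DATE"].replace(
            hour=row["MEETING TIME"].hour,
            minute=row["MEETING TIME"].minute,
        ),
        axis=1,
    )


# Total wall-clock time for three runs of each approach.
combine_seconds = timeit.timeit(lambda: with_combine(df), number=3)
replace_seconds = timeit.timeit(lambda: with_replace(df), number=3)
print(f"combine: {combine_seconds:.3f}s, replace: {replace_seconds:.3f}s")
```

Both helpers go through apply with axis=1, so most of the time is likely spent in per-row overhead rather than in combine or replace themselves.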

UPDATE2: I have tried jezrael's approach:

def combine_date_time(df, datecol, timecol):
    return pd.to_datetime(df[datecol].dt.date.astype(str)
                          + ' '
                          + df[timecol].astype(str))

This approach is far faster by comparison; jezrael is right. I haven't been able to measure it properly, but the difference is evident.
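
Here is a minimal self-contained sketch of the string-based approach on the two rows from the question. The second variant, which adds a timedelta to the date instead of going through to_datetime, is not taken from any answer here; it is included only as an assumption about another vectorized way to get the same result.

```python
from datetime import time

import pandas as pd

# Sample data rebuilt by hand from the two rows shown in the question.
df = pd.DataFrame(
    {
        "MEETING DATE": pd.to_datetime(["2013-12-16", "2013-12-12"]),
        "MEETING TIME": [time(14, 0), time(13, 30)],
    }
)

# String-based approach: build "YYYY-MM-DD HH:MM:SS" strings and parse them.
as_strings = pd.to_datetime(
    df["MEETING DATE"].dt.date.astype(str) + " " + df["MEETING TIME"].astype(str)
)

# Alternative (an assumption, not from the answers): turn the times into
# timedeltas and add them to the dates, normalized to midnight.
as_timedelta = df["MEETING DATE"].dt.normalize() + pd.to_timedelta(
    df["MEETING TIME"].astype(str)
)

print(as_strings)
print(as_timedelta)
```

Both variants operate on whole columns at once, which is why they avoid the per-row cost of apply.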

answered Sep 22 '22 18:09

jabellcu
