Add Multiple Columns to Pandas Dataframe from Function

Tags:

pandas

I have a pandas DataFrame mydf with two columns, mydate and mytime, both of datetime dtype. I want to add three more columns: hour, weekday, and weeknum.

def getH(t):  # gives the hour
    return t.hour

def getW(d):  # gives the week number
    return d.isocalendar()[1]

def getD(d):  # gives the weekday
    return d.weekday()  # 0 for Monday, 6 for Sunday

mydf["hour"] = mydf.apply(lambda row: getH(row["mytime"]), axis=1)
mydf["weekday"] = mydf.apply(lambda row: getD(row["mydate"]), axis=1)
mydf["weeknum"] = mydf.apply(lambda row: getW(row["mydate"]), axis=1)

The snippet works, but it is not computationally efficient because it loops through the data frame three times, once per new column. Is there a faster or more idiomatic way to do this, for example using zip or merge? If I write a single function that returns three elements, how should I apply it? To illustrate, the function would be:

def getHWd(d, t):
    return t.hour, d.isocalendar()[1], d.weekday()

491

asked May 04 '15 09:05

EFL

2 Answers

Here's one approach that does it with a single apply.

Say df looks like this:

In [64]: df
Out[64]:
       mydate     mytime
0  2011-01-01 2011-11-14
1  2011-01-02 2011-11-15
2  2011-01-03 2011-11-16
3  2011-01-04 2011-11-17
4  2011-01-05 2011-11-18
5  2011-01-06 2011-11-19
6  2011-01-07 2011-11-20
7  2011-01-08 2011-11-21
8  2011-01-09 2011-11-22
9  2011-01-10 2011-11-23
10 2011-01-11 2011-11-24
11 2011-01-12 2011-11-25

We'll move the lambda function onto its own line for readability and define it like this:

In [65]: lambdafunc = lambda x: pd.Series([x['mytime'].hour,
                                           x['mydate'].isocalendar()[1],
                                           x['mydate'].weekday()])

Then apply it and store the result in df[['hour', 'weeknum', 'weekday']]. The column names are listed in the same order as the values returned by lambdafunc (hour, ISO week number, weekday):

In [66]: df[['hour', 'weeknum', 'weekday']] = df.apply(lambdafunc, axis=1)

The output looks like this:

In [67]: df
Out[67]:
       mydate     mytime  hour  weeknum  weekday
0  2011-01-01 2011-11-14     0       52        5
1  2011-01-02 2011-11-15     0       52        6
2  2011-01-03 2011-11-16     0        1        0
3  2011-01-04 2011-11-17     0        1        1
4  2011-01-05 2011-11-18     0        1        2
5  2011-01-06 2011-11-19     0        1        3
6  2011-01-07 2011-11-20     0        1        4
7  2011-01-08 2011-11-21     0        1        5
8  2011-01-09 2011-11-22     0        1        6
9  2011-01-10 2011-11-23     0        2        0
10 2011-01-11 2011-11-24     0        2        1
11 2011-01-12 2011-11-25     0        2        2

165

answered Sep 18 '22 12:09

Zero
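The question also asks how a single function returning three elements could be used, possibly with zip. The following is a minimal sketch of two ways to do that, not a benchmarked recommendation. The sample data is made up for illustration, and the names via_zip, via_apply and cols are hypothetical; only getHWd comes from the question.

```python
import pandas as pd

# Hypothetical sample data; the question's real frame is not shown.
df = pd.DataFrame({
    "mydate": pd.to_datetime(["2011-01-01", "2011-01-02", "2011-01-03"]),
    "mytime": pd.to_datetime(["2011-11-14 08:30", "2011-11-15 13:00", "2011-11-16 23:59"]),
})

def getHWd(d, t):
    # Function from the question: returns (hour, ISO week number, weekday)
    return t.hour, d.isocalendar()[1], d.weekday()

cols = ["hour", "weeknum", "weekday"]

# Variant 1: walk both columns once, then transpose the tuples with zip(*...)
hours, weeknums, weekdays = zip(
    *(getHWd(d, t) for d, t in zip(df["mydate"], df["mytime"]))
)
via_zip = df.assign(hour=list(hours), weeknum=list(weeknums), weekday=list(weekdays))

# Variant 2: a single apply; result_type="expand" turns each tuple into columns
expanded = df.apply(
    lambda row: getHWd(row["mydate"], row["mytime"]),
    axis=1,
    result_type="expand",
)
expanded.columns = cols  # same order as the values returned by getHWd
via_apply = df.join(expanded)

print(via_zip)
```

Both variants call getHWd once per row, so each makes a single pass over the data instead of three.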

To complement John Galt's answer:

Depending on the task that is performed by lambdafunc, you may experience some speedup by storing the result of apply in a new DataFrame and then joining with the original:

lambdafunc = lambda x: pd.Series([x['mytime'].hour,
                                  x['mydate'].isocalendar()[1],
                                  x['mydate'].weekday()])

newcols = df.apply(lambdafunc, axis=1)
# Names follow the order of the values returned by lambdafunc
newcols.columns = ['hour', 'weeknum', 'weekday']
newdf = df.join(newcols)

Even if you do not see a speed improvement, I would recommend using the join. You will be able to avoid the (always annoying) SettingWithCopyWarning that may pop up when assigning directly on the columns:

SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

answered Sep 20 '22 12:09

Pedro M Duarte
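Since the question states that both columns are of datetime dtype, the row-wise apply can also be avoided altogether. Here is a minimal sketch using the .dt accessor, which operates on a whole column at once rather than calling a Python function per row. It assumes pandas 1.1 or later (where Series.dt.isocalendar() is available), and the sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the question's real frame is not shown.
df = pd.DataFrame({
    "mydate": pd.to_datetime(["2011-01-01", "2011-01-02", "2011-01-03"]),
    "mytime": pd.to_datetime(["2011-11-14 08:30", "2011-11-15 13:00", "2011-11-16 23:59"]),
})

# Each assignment works on the whole column; no per-row Python call is made.
df["hour"] = df["mytime"].dt.hour
# isocalendar() returns a frame with year/week/day columns; keep the week as plain int
df["weeknum"] = df["mydate"].dt.isocalendar().week.astype(int)
df["weekday"] = df["mydate"].dt.weekday  # 0 for Monday, 6 for Sunday

print(df)
```

Whether this is noticeably faster depends on the size of the frame, so it is worth timing on the real data.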
