
Python pandas: apply a function with two arguments to columns

Can you make a python pandas function with values in two different columns as arguments?

I have a function that returns 1 if two columns have values in the same range; otherwise it returns 0:

def segmentMatch(RealTime, ResponseTime):
    if RealTime <= 566 and ResponseTime <= 566:
        matchVar = 1
    elif 566 < RealTime <= 1132 and 566 < ResponseTime <= 1132:
        matchVar = 1
    elif 1132 < RealTime <= 1698 and 1132 < ResponseTime <= 1698:
        matchVar = 1
    else:
        matchVar = 0
    return matchVar

I want the first argument, RealTime, to be a column in my data frame, so that the function takes the value of each row in that column: e.g. RealTime is df['TimeCol'] and the second argument, ResponseTime, is df['ResponseCol']. I'd like the result to be a new column in the dataframe. I came across several threads that answered a similar question, but it looks like those arguments were variables, not values in rows of the dataframe.

I tried the following but it didn't work:

df['NewCol'] = df.apply(segmentMatch, args=(df['TimeCol'], df['ResponseCol']), axis=1) 
Maria asked Dec 15 '15 01:12



2 Answers

Why not just do this?

df['NewCol'] = df.apply(lambda x: segmentMatch(x['TimeCol'], x['ResponseCol']), axis=1) 

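Rather than passing whole columns as arguments, as in your example, this passes the matching entries from each row as arguments and stores the result in 'NewCol'. Your attempt failed because args only appends extra positional arguments after the row itself, so segmentMatch was called with three arguments instead of two.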
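To see it run end to end, here is a minimal sketch. The sample values are made up for illustration; only the column names and the function come from the question.

```python
import pandas as pd

def segmentMatch(RealTime, ResponseTime):
    # Return 1 when both values fall in the same segment, else 0.
    if RealTime <= 566 and ResponseTime <= 566:
        return 1
    elif 566 < RealTime <= 1132 and 566 < ResponseTime <= 1132:
        return 1
    elif 1132 < RealTime <= 1698 and 1132 < ResponseTime <= 1698:
        return 1
    return 0

# Hypothetical sample data, not from the question.
df = pd.DataFrame({
    'TimeCol': [100, 600, 1200, 500, 1700],
    'ResponseCol': [200, 1000, 1500, 700, 1700],
})

# With axis=1, each row is handed to the lambda as a Series,
# so x['TimeCol'] and x['ResponseCol'] are single values.
df['NewCol'] = df.apply(
    lambda x: segmentMatch(x['TimeCol'], x['ResponseCol']), axis=1
)
print(df)
```

With this sample data, the first three rows should get 1 and the last two 0, since 500 and 700 sit in different segments and 1700 is above the last upper bound.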

Niels Wouda answered Oct 12 '22 15:10


You don't really need a lambda function if you are defining the function outside:

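def segmentMatch(vec):
    # vec is one row of the two selected columns, passed in as a Series.
    # .iloc is used for positional access; plain vec[0] on a string-labelled
    # Series is deprecated in recent pandas versions.
    RealTime = vec.iloc[0]
    ResponseTime = vec.iloc[1]
    if RealTime <= 566 and ResponseTime <= 566:
        matchVar = 1
    elif 566 < RealTime <= 1132 and 566 < ResponseTime <= 1132:
        matchVar = 1
    elif 1132 < RealTime <= 1698 and 1132 < ResponseTime <= 1698:
        matchVar = 1
    else:
        matchVar = 0
    return matchVar

df['NewCol'] = df[['TimeCol', 'ResponseCol']].apply(segmentMatch, axis=1)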

If "segmentMatch" were to return a vector of 2 values instead, you could do the following:

def segmentMatch(vec):
    ......
    return pd.Series((matchVar1, matchVar2))

df[['NewCol', 'NewCol2']] = df[['TimeCol', 'ResponseCol']].apply(segmentMatch, axis=1)
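As a rough, runnable sketch of that two-value pattern: the elided body above is not filled in here. Instead, this uses a hypothetical variant, segmentMatchPair, with a hypothetical helper, segment, and made-up sample data. The second returned value (the absolute difference between the two times) is purely an assumption for illustration.

```python
import pandas as pd

def segment(value):
    # Hypothetical helper: index of the segment a value falls in
    # (0, 1 or 2), or -1 when it is above the last bound of 1698.
    for i, upper in enumerate((566, 1132, 1698)):
        if value <= upper:
            return i
    return -1

def segmentMatchPair(vec):
    # vec is one row (a Series) holding 'TimeCol' and 'ResponseCol'.
    real_seg = segment(vec['TimeCol'])
    resp_seg = segment(vec['ResponseCol'])
    matchVar1 = int(real_seg == resp_seg and real_seg != -1)
    # Assumed second output, chosen only to have something to return.
    matchVar2 = abs(vec['TimeCol'] - vec['ResponseCol'])
    return pd.Series((matchVar1, matchVar2))

# Hypothetical sample data.
df = pd.DataFrame({
    'TimeCol': [100, 600, 1200, 500, 1700],
    'ResponseCol': [200, 1000, 1500, 700, 1700],
})

# Returning a Series per row makes apply produce a two-column DataFrame,
# which is then assigned to the two new column names in order.
df[['NewCol', 'NewCol2']] = df[['TimeCol', 'ResponseCol']].apply(
    segmentMatchPair, axis=1
)
print(df)
```

Selecting by label inside the function (vec['TimeCol']) avoids relying on column order, at the cost of tying the function to those column names.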
rahul answered Oct 12 '22 17:10