
How to apply a function to two columns of Pandas dataframe

Suppose I have a df with columns 'ID', 'col_1', and 'col_2', and I define a function:

f = lambda x, y : my_function_expression

Now I want to apply f to the two columns 'col_1' and 'col_2' of df to calculate a new column 'col_3' element-wise, something like:

df['col_3'] = df[['col_1','col_2']].apply(f)   # Pandas gives : TypeError: ('<lambda>() takes exactly 2 arguments (1 given)' 

How can I do this?

*** Added a detailed sample below ***

import pandas as pd

df = pd.DataFrame({'ID':['1','2','3'], 'col_1': [0,2,3], 'col_2':[1,4,5]})
mylist = ['a','b','c','d','e','f']

def get_sublist(sta,end):
    return mylist[sta:end+1]

#df['col_3'] = df[['col_1','col_2']].apply(get_sublist,axis=1)
# expect above to output df as below

  ID  col_1  col_2            col_3
0  1      0      1       ['a', 'b']
1  2      2      4  ['c', 'd', 'e']
2  3      3      5  ['d', 'e', 'f']
bigbug asked Nov 11 '12 13:11




1 Answer

Here's an example using apply on the dataframe, which I am calling with axis = 1.

The difference is that, instead of trying to pass two values to the function f, you rewrite f to accept a single pandas Series (one row of the dataframe) and then index into that Series to get the values you need.

In [49]: df
Out[49]:
          0         1
0  1.000000  0.000000
1 -0.494375  0.570994
2  1.000000  0.000000
3  1.876360 -0.229738
4  1.000000  0.000000

In [50]: def f(x):
   ....:     return x[0] + x[1]
   ....:

In [51]: df.apply(f, axis=1) #passes a Series object, row-wise
Out[51]:
0    1.000000
1    0.076619
2    1.000000
3    1.646622
4    1.000000
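Applied to the sample data from the question, the same idea might look like the sketch below. The wrapper name row_sublist is hypothetical and only there for illustration. It receives each row as a Series and unpacks the two values before calling the original two-argument get_sublist.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})
mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta, end):
    # Original two-argument function from the question (inclusive end index).
    return mylist[sta:end + 1]

def row_sublist(row):
    # Hypothetical wrapper: row is a Series holding one row's col_1 and col_2.
    return get_sublist(row['col_1'], row['col_2'])

# axis=1 passes each row (not each column) to the function.
df['col_3'] = df[['col_1', 'col_2']].apply(row_sublist, axis=1)
print(df)
```

The wrapper could also be written inline as a lambda. Keeping get_sublist unchanged means the two-argument function stays usable outside of pandas.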

Depending on your use case, it is sometimes helpful to create a pandas groupby object and then use apply on each group.
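As a rough sketch of the groupby variant, assuming a made-up 'key' column to group on (neither the column nor the group_total function comes from the question), the function receives a small DataFrame per group and can combine the two columns there:

```python
import pandas as pd

# Made-up data: 'key' is an assumed grouping column for illustration only.
df = pd.DataFrame({
    'key': ['a', 'a', 'b'],
    'col_1': [1, 2, 3],
    'col_2': [10, 20, 40],
})

def group_total(group):
    # group is a DataFrame with the col_1 and col_2 rows for one key.
    return (group['col_1'] + group['col_2']).sum()

# Select the two columns first so only they are passed to the function.
totals = df.groupby('key')[['col_1', 'col_2']].apply(group_total)
print(totals)
```

This produces one value per group rather than one per row, so it fits aggregations better than the element-wise case in the question.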

Aman answered Sep 30 '22 03:09