Python Pandas: returning more than one field value when applying a function to a data frame row

I need to apply several functions to data frame rows. Arguments of these functions take values from two or more fields of a single row. For example:

import pandas as pd

d = {
  'a': [1,1,1,1],
  'b': [2,2,2,2],
  'c': [3,3,3,3],
  'd': [4,4,4,4]
}

df1 = pd.DataFrame(d)

def f1(x,y):
    return x + 2*y

def f2(x,y):
    return y + 2*x

df2 = pd.DataFrame()
# Each apply call below walks over every row of df1 (axis=1 means row-wise).
df2['val1'] = df1.apply(lambda r: f1(r.a, r.b), axis=1)
df2['val2'] = df1.apply(lambda r: f2(r.c, r.d), axis=1)

When the functions are applied one after another, Pandas makes a separate pass over all data frame rows for each call. In this example Pandas iterates over the data frame twice. As a result I get:

In [10]: df2
Out[10]:
   val1  val2
0     5    10
1     5    10
2     5    10
3     5    10

Is there any way to apply two or more functions like this in a single pass over the data frame? The application would then return values for more than one field in each row. This also covers the case of a single function returning values for more than one field. How can this be done?

zork asked Feb 20 '26 00:02

1 Answer

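You could fill both columns at the same time by combining your functions into one that operates on whole columns instead of individual rows:

def f3(a, b, c, d):
    # Arguments are entire columns; returns (f1(a, b), f2(c, d)) computed element-wise.
    return a + 2*b, d + 2*c

df3 = pd.DataFrame()
df3['val1'], df3['val2'] = f3(df1.a, df1.b, df1.c, df1.d)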
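
If the functions cannot be rewritten to work on whole columns, a single row-wise pass may still be possible. Here is a minimal sketch, assuming pandas 0.23 or later (where `apply` accepts `result_type='expand'`). The helper name `both` is made up for illustration and is not part of the question:

```python
import pandas as pd

df1 = pd.DataFrame({
    'a': [1, 1, 1, 1],
    'b': [2, 2, 2, 2],
    'c': [3, 3, 3, 3],
    'd': [4, 4, 4, 4],
})

def f1(x, y):
    return x + 2 * y

def f2(x, y):
    return y + 2 * x

def both(r):
    # Hypothetical helper: evaluates both functions for one row
    # and returns the two results as a tuple.
    return f1(r.a, r.b), f2(r.c, r.d)

# One row-wise pass; result_type='expand' spreads each returned
# tuple across separate columns of the resulting DataFrame.
df2 = df1.apply(both, axis=1, result_type='expand')
df2.columns = ['val1', 'val2']
print(df2)
```

Each tuple returned by `both` becomes one row of the result, so the frame is traversed only once. Row-wise `apply` is still usually slower than column arithmetic, so the combined function above is probably the better choice whenever the functions allow it.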
r3robertson answered Feb 21 '26 14:02