Apply custom function over multiple columns in pandas

I am having trouble "applying" a custom function in Pandas. When I test the function by passing values directly, it works and returns the correct result. However, when I attempt to pass the column values this way:

def feez (rides, plan):
    pmt4       = 200
    inc4       = 50  #number rides included
    min_rate4  = 4 

    if plan == "4 Plan":
        if rides > inc4:
            fee = ((rides - inc4) * min_rate4) + pmt4 
        else:
            fee = pmt4
        return (fee)
    else:
        return 0.1

df['fee'].apply(feez(df.total_rides, df.plan_name))

I receive the error:

"The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

Passing the values directly works: feez(800, "4 Plan") returns 3200.

The error only appears when I try to apply the function to the columns as shown above.

I am a newbie and suspect my syntax is poorly written. Any ideas much appreciated. TIA. Eli

asked Nov 18 '17 23:11 by eli


1 Answer

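The error is raised before apply ever runs: Python first evaluates feez(df.total_rides, df.plan_name) with two whole Series as arguments, so plan == "4 Plan" yields a boolean Series, and an if statement cannot reduce that to a single True or False. To run feez on one row at a time, call apply on the DataFrame with axis=1 and wrap the call in a lambda.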

df['fee'] = df.apply(lambda x: feez(x['total_rides'], x['plan_name']), axis=1)
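As a minimal sketch of the difference, the snippet below reproduces the error and then applies the row-wise fix. The sample rows are hypothetical; only the column names and the feez logic come from the question.

```python
import pandas as pd

def feez(rides, plan):
    pmt4 = 200       # flat payment for the "4 Plan"
    inc4 = 50        # number of rides included
    min_rate4 = 4    # charge per ride beyond the included ones
    if plan == "4 Plan":
        if rides > inc4:
            return (rides - inc4) * min_rate4 + pmt4
        return pmt4
    return 0.1

# Hypothetical sample data (an assumption for illustration).
df = pd.DataFrame({
    "total_rides": [800, 30, 120],
    "plan_name": ["4 Plan", "4 Plan", "Other Plan"],
})

# Passing whole columns makes `if plan == "4 Plan"` test a boolean Series,
# which raises the "truth value of a Series is ambiguous" ValueError.
try:
    feez(df.total_rides, df.plan_name)
    raised = False
except ValueError:
    raised = True

# Row-wise apply hands feez one pair of scalars per row.
df["fee"] = df.apply(lambda row: feez(row["total_rides"], row["plan_name"]), axis=1)
fees = df["fee"].tolist()
print(raised, fees)
```

The first row matches the value reported in the question: feez(800, "4 Plan") gives 3200.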

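However, row-wise apply is slow on large frames, and there are possibly faster ways to do this. One is np.vectorize, which still loops over the elements internally but avoids building a Series for every row. The other is np.where, which operates on whole columns at once.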

Option 1
np.vectorize

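import numpy as np
# otypes=[float] stops an int first result from truncating 0.1 to 0
v = np.vectorize(feez, otypes=[float])
df['fee'] = v(df.total_rides, df.plan_name)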

Option 2
Nested np.where

pmt4, inc4, min_rate4 = 200, 50, 4  # same constants as inside feez

df['fee'] = np.where(
    df.plan_name == "4 Plan",
    np.where(df.total_rides > inc4, (df.total_rides - inc4) * min_rate4 + pmt4, pmt4),
    0.1
)
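To check that row-wise apply, np.vectorize, and nested np.where give the same fees, here is a small sketch on hypothetical data. The sample rows and the module-level names PMT4, INC4, and MIN_RATE4 are assumptions for illustration; their values are the constants from feez in the question. Passing otypes=[float] to np.vectorize is a deliberate choice here: without it the output dtype is taken from the first result, and an integer first result would truncate 0.1 to 0.

```python
import numpy as np
import pandas as pd

PMT4, INC4, MIN_RATE4 = 200, 50, 4  # constants taken from the question's feez

def feez(rides, plan):
    if plan == "4 Plan":
        return (rides - INC4) * MIN_RATE4 + PMT4 if rides > INC4 else PMT4
    return 0.1

# Hypothetical sample data, including the boundary case rides == INC4.
df = pd.DataFrame({
    "total_rides": [800, 30, 50, 120],
    "plan_name": ["4 Plan", "4 Plan", "4 Plan", "Other Plan"],
})

# Row-wise apply: one Python call per row.
by_apply = df.apply(lambda row: feez(row["total_rides"], row["plan_name"]), axis=1)

# np.vectorize: still one call per element, but with a fixed float output dtype.
by_vectorize = np.vectorize(feez, otypes=[float])(df.total_rides, df.plan_name)

# Nested np.where: whole-column operations, no per-row Python call.
by_where = np.where(
    df.plan_name == "4 Plan",
    np.where(df.total_rides > INC4, (df.total_rides - INC4) * MIN_RATE4 + PMT4, PMT4),
    0.1,
)

print(by_where)
```

All three results hold the same four values, so the choice between them comes down to readability and speed on the real data.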
answered Sep 21 '22 12:09 by cs95