
pandas apply() with and without lambda

What is the rule/process when a function is called with pandas apply() through a lambda vs. without one? Examples below. Without the lambda, apparently the entire series ( df[column name] ) is passed to the "test" function, which throws an error when it tries to do a boolean operation on a series.

If the same function is called via a lambda, it works: apply() iterates over the rows, passing each one as "x", and x[ column name ] returns a single value for that column in the current row.

It's like the lambda is removing a dimension. Does anyone have an explanation, or can anyone point to the specific doc on this? Thanks.

Example 1 with lambda, works OK

print("probPredDF columns:", probPredDF.columns)

def test( x, y):
    if x==y:
        r = 'equal'
    else:
        r = 'not equal'
    return r    

probPredDF.apply( lambda x: test( x['yTest'], x[ 'yPred']), axis=1 ).head()

Example 1 output

probPredDF columns: Index([0, 1, 'yPred', 'yTest'], dtype='object')

Out[215]:
0    equal
1    equal
2    equal
3    equal
4    equal
dtype: object

Example 2 without lambda, throws boolean operation on series error

print("probPredDF columns:", probPredDF.columns)

def test( x, y):
    if x==y:
        r = 'equal'
    else:
        r = 'not equal'
    return r    

probPredDF.apply( test( probPredDF['yTest'], probPredDF[ 'yPred']), axis=1 ).head()

Example 2 output

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Tom Miron asked May 05 '17 16:05




1 Answer

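There is nothing magic about a lambda. It is just an anonymous function defined inline. apply() expects a callable, and with axis=1 it calls that callable once per row, passing the row as its single argument. In Example 2, test( probPredDF['yTest'], probPredDF[ 'yPred']) is not a function being handed to apply(); it is a call that Python evaluates first, with two whole Series as arguments, so x==y produces a Series and the if statement raises the ValueError before apply() ever runs. You can use a named function anywhere a lambda is accepted, as long as it takes the row as its one parameter. You need to do something like...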

Define it as:

def wrapper(x):
    return test(x['yTest'], x['yPred'])

Use it as:

probPredDF.apply(wrapper, axis=1)
Stephen Rauch answered Sep 28 '22 11:09
