What is the rule/process when a function is called with pandas apply()
through lambda vs. not? Examples below. Without lambda apparently, the entire series ( df[column name] ) is passed to the "test" function which throws an error trying to do a boolean operation on a series.
If the same function is called via lambda it works. Iteration over each row with each passed as "x" and the df[ column name ] returns a single value for that column in the current row.
It's like lambda is removing a dimension. Anyone have an explanation or point to the specific doc on this? Thanks.
Example 1 with lambda, works OK
print("probPredDF columns:", probPredDF.columns)
def test( x, y):
if x==y:
r = 'equal'
else:
r = 'not equal'
return r
probPredDF.apply( lambda x: test( x['yTest'], x[ 'yPred']), axis=1 ).head()
Example 1 output
probPredDF columns: Index([0, 1, 'yPred', 'yTest'], dtype='object')
Out[215]:
0 equal
1 equal
2 equal
3 equal
4 equal
dtype: object
Example 2 without lambda, throws boolean operation on series error
print("probPredDF columns:", probPredDF.columns)
def test( x, y):
if x==y:
r = 'equal'
else:
r = 'not equal'
return r
probPredDF.apply( test( probPredDF['yTest'], probPredDF[ 'yPred']), axis=1 ).head()
Example 2 output
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
pandas. DataFrame. apply() can be used with python lambda to execute expression. A lambda function in python is a small anonymous function that can take any number of arguments and execute an expression.
apply() method. This function acts as a map() function in Python. It takes a function as an input and applies this function to an entire DataFrame. If you are working with tabular data, you must specify an axis you want your function to act on ( 0 for columns; and 1 for rows).
Pandas DataFrame apply() Method The apply() method allows you to apply a function along one of the axis of the DataFrame, default 0, which is the index (row) axis.
There is nothing magic about a lambda
. They are functions in one parameter, that can be defined inline, and do not have a name. You can use a function where a lambda is expected, but the function will need to also take one parameter. You need to do something like...
Define it as:
def wrapper(x):
return test(x['yTest'], x['yPred'])
Use it as:
probPredDF.apply(wrapper, axis=1)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With