 

Trouble passing in lambda to apply for pandas DataFrame

I'm trying to apply a function to all rows of a pandas DataFrame (actually just one column in that DataFrame)

I'm sure this is a syntax error, but I'm not sure what I'm doing wrong

df['col'].apply(lambda x, y:(x - y).total_seconds(), args=[d1], axis=1) 

The col column contains a bunch of datetime.datetime objects, and d1 is the earliest of them. I'm trying to get a column of the total number of seconds for each of the rows

EDIT I keep getting the following error

TypeError: <lambda>() got an unexpected keyword argument 'axis' 

I don't understand why axis is getting passed to my lambda function

EDIT 2

I've also tried doing

def diff_dates(d1, d2):
    return (d1-d2).total_seconds()

df['col'].apply(diff_dates, args=[d1], axis=1)

And I get the same error

sedavidw asked Mar 19 '15 21:03





1 Answer

Note that Series.apply has no axis param, unlike DataFrame.apply.

Series.apply(func, convert_dtype=True, args=(), **kwds)

func : function
convert_dtype : boolean, default True
    Try to find better dtype for elementwise function results. If False, leave as dtype=object
args : tuple
    Positional arguments to pass to function in addition to the value

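DataFrame.apply does take an axis param, but you're calling apply on a Series (a single column), so it's unclear how you expect it to operate on a row. Because Series.apply forwards any unrecognised keyword arguments (**kwds) to the function, axis=1 ends up being passed to your lambda, which is why you get the TypeError.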
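As a rough sketch of what the corrected call might look like: the sample timestamps below are made up for illustration, and the names secs_apply, secs_vec and secs_rows are hypothetical. Only df, col and d1 come from the question. The first form simply drops axis; the second avoids apply altogether by subtracting on the whole column; the third shows where axis=1 does belong, namely on DataFrame.apply.

```python
from datetime import datetime

import pandas as pd

# Made-up sample data standing in for the question's df['col']
df = pd.DataFrame({
    "col": [
        datetime(2015, 3, 19, 21, 0, 0),
        datetime(2015, 3, 19, 21, 0, 30),
        datetime(2015, 3, 19, 21, 2, 0),
    ]
})
d1 = df["col"].min()  # earliest timestamp, as in the question

# Series.apply: no axis argument; extra positional arguments go in args
secs_apply = df["col"].apply(lambda x, y: (x - y).total_seconds(), args=(d1,))

# Vectorized alternative: subtract on the whole column, then use the .dt accessor
secs_vec = (df["col"] - d1).dt.total_seconds()

# DataFrame.apply does accept axis=1; the function then receives each row
secs_rows = df.apply(lambda row: (row["col"] - d1).total_seconds(), axis=1)

print(secs_apply.tolist())
```

All three should give the same values; the vectorized form is usually the one to prefer on large columns, since it avoids calling a Python function per element.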

EdChum answered Oct 07 '22 13:10
