
applying lambda row on multiple columns pandas

I am creating a sample dataframe:

import pandas as pd

tp = pd.DataFrame({'source':['a','s','f'],
                   'target':['b','n','m'],
                   'count':[0,8,4]})

I want to create a column 'col' based on a condition on the 'target' column: if 'target' matches the condition, 'col' should take the value of 'source'; otherwise it should take a default value, as below:

tp['col'] = tp.apply(lambda row:row['source'] if row['target'] in ['b','n'] else 'x')

But this raises the following error: KeyError: ('target', 'occurred at index count')

How can I make it work, without defining a function?

asked Jun 28 '18 10:06 by muni



People also ask

Can a lambda function take more than one column?

Yes. You can apply a lambda function to multiple columns of a DataFrame using DataFrame.apply(), a lambda, and NumPy functions.

How do you apply a function to all rows of a column in pandas?

To apply a function to every row, pass axis=1 to apply(). By default it uses axis=0, which applies the function to each column. Applying a function to each row lets you create a new column from the values in that row, compute an updated row, and so on.

How do I apply a lambda function to a column in pandas?

We can do this with the apply() function in Pandas. We can use the apply() function to apply the lambda function to both rows and columns of a dataframe. If the axis argument in the apply() function is 0, then the lambda function gets applied to each column, and if 1, then the function gets applied to each row.


1 Answer

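You need to pass axis=1 to tell pandas to apply the function to each row. The default is axis=0, which passes each column to the lambda as a Series indexed by the row labels (0, 1, 2), so the lookup row['target'] raises the KeyError.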

tp['col'] = tp.apply(lambda row: row['source'] if row['target'] in ['b', 'n'] else 'x',
                     axis=1)
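To see why the axis matters, the sketch below checks whether the label 'target' is available to the lambda under each setting, then applies the row-wise fix. The names has_target_axis0 and has_target_axis1 are made up for this illustration; the data is the sample frame from the question.

```python
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

# axis=0 (default): the lambda gets one column at a time, indexed by the
# row labels 0, 1, 2, so 'target' is not a valid key.
has_target_axis0 = tp.apply(lambda s: 'target' in s.index)

# axis=1: the lambda gets one row at a time, indexed by the column names,
# so row['target'] and row['source'] both work.
has_target_axis1 = tp.apply(lambda r: 'target' in r.index, axis=1)

# Row-wise version of the original expression.
tp['col'] = tp.apply(
    lambda row: row['source'] if row['target'] in ['b', 'n'] else 'x',
    axis=1)

print(has_target_axis0.any())  # False
print(has_target_axis1.all())  # True
print(tp['col'].tolist())      # ['a', 's', 'x']
```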

However, for this specific task, you should use vectorised operations. For example, using numpy.where:

import numpy as np

tp['col'] = np.where(tp['target'].isin(['b', 'n']), tp['source'], 'x')

pd.Series.isin returns a Boolean Series. Where it is True, numpy.where takes the value from the second argument (tp['source']); where it is False, it takes the third ('x').
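A self-contained sketch of the vectorised approach on the question's sample data, with the mask pulled out into its own variable so the intermediate step is visible. The variable names mask and alt are only for illustration, and the Series.where line is an alternative not mentioned above that should give the same values.

```python
import numpy as np
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

# Boolean mask: True where 'target' is one of the accepted values.
mask = tp['target'].isin(['b', 'n'])

# Take 'source' where the mask is True, the default 'x' elsewhere.
tp['col'] = np.where(mask, tp['source'], 'x')

# Pandas-only alternative: keep 'source' where the mask holds,
# replace everything else with 'x'.
alt = tp['source'].where(mask, 'x')

print(mask.tolist())       # [True, True, False]
print(tp['col'].tolist())  # ['a', 's', 'x']
print(alt.tolist())        # ['a', 's', 'x']
```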

answered Sep 18 '22 13:09 by jpp
