
Pandas DataFrame: How to update multiple columns by applying a function?

Tags: python, pandas

I have a DataFrame df like this:

   A   B   C    D
2  1   O   s    h
4  2   P    
7  3   Q
9  4   R   h    m

I have a function f to calculate C and D based on B for a row:

def f(p):  # p is the value of column B for a row
    return p + 'k', p + 'n'

How can I populate the missing values in rows 4 and 7 by applying the function f to the DataFrame?

The expected outcome is like below:

   A   B   C    D
2  1   O   s    h
4  2   P   Pk   Pn
7  3   Q   Qk   Qn
9  4   R   h    m

The function f has to be used, as the real function is very complicated. Also, the function only needs to be applied to the rows that are missing C and D.

John Smith asked Sep 16 '15 08:09



1 Answer

Maybe there is a more elegant way, but I would do it this way:

df['C'] = df['B'].apply(lambda x: f(x)[0])
df['D'] = df['B'].apply(lambda x: f(x)[1])

This applies f to column B and takes the first and second elements of each result. Note that it recomputes C and D for every row, not only the ones with missing values. It returns:

   A  B   C   D
0  1  O  Ok  On
1  2  P  Pk  Pn
2  3  Q  Qk  Qn
3  4  R  Rk  Rn
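
To leave rows that already have C and D untouched, as the question asks, the same two assignments can be restricted with a boolean mask. This is a minimal sketch, assuming the blanks are stored as NaN/None (the question does not say; empty strings would need a different test). The sample frame is rebuilt here only for illustration:

```python
import pandas as pd


def f(p):
    # Stand-in for the real, more complicated function from the question.
    return p + 'k', p + 'n'


# Sample frame from the question; missing cells are assumed to be None/NaN.
df = pd.DataFrame(
    {
        'A': [1, 2, 3, 4],
        'B': ['O', 'P', 'Q', 'R'],
        'C': ['s', None, None, 'h'],
        'D': ['h', None, None, 'm'],
    },
    index=[2, 4, 7, 9],
)

# True only for rows where both C and D are missing.
mask = df['C'].isna() & df['D'].isna()

# Apply f to B on the masked rows only; other rows keep their values.
df.loc[mask, 'C'] = df.loc[mask, 'B'].apply(lambda x: f(x)[0])
df.loc[mask, 'D'] = df.loc[mask, 'B'].apply(lambda x: f(x)[1])
```

The mask is computed once before either assignment, so filling C does not change which rows get D filled.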

EDIT:

More concisely, thanks to this answer:

df[['C','D']] = df['B'].apply(lambda x: pd.Series(f(x)))
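
If f is expensive, it may be worth making sure it runs exactly once per row and avoiding a new Series for every row. A possible sketch using zip to unpack the returned tuples into two columns; the small frame and the stand-in f are illustrative, not the asker's real data:

```python
import pandas as pd


def f(p):
    # Stand-in for the real function; returns a (C, D) pair.
    return p + 'k', p + 'n'


df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['O', 'P', 'Q', 'R']})

# map calls f once per value of B and yields a Series of tuples;
# zip(*...) transposes those tuples into one sequence per output column.
df['C'], df['D'] = zip(*df['B'].map(f))
```

Like the one-liner above, this fills C and D for every row. To touch only some rows, the same idea can be combined with a boolean mask and df.loc.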

Fabio Lamanna answered Oct 21 '22 07:10