I have a DataFrame df like this:
   A  B  C  D
2  1  O  s  h
4  2  P
7  3  Q
9  4  R  h  m
I have a function f to calculate C and D based on B for a row:
def f(p):  # p is the value of column B for a row.
    return p + 'k', p + 'n'
How can I populate the missing values in rows 4 and 7 by applying the function f to the DataFrame?
The expected outcome is like below:
   A  B  C   D
2  1  O  s   h
4  2  P  Pk  Pn
7  3  Q  Qk  Qn
9  4  R  h   m
The function f has to be used, as the real function is very complicated. Also, the function only needs to be applied to the rows missing C and D.
To replace multiple values in a DataFrame, you can use the DataFrame.replace() method with a dictionary of replacements passed as the argument.
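As a rough illustration of the dictionary form of replace(), here is a small sketch. The frame and the replacement values 'X' and 'Y' are made up for the example and are not part of the question.

```python
import pandas as pd

# Hypothetical frame reusing the letters from column B in the question.
df = pd.DataFrame({'B': ['O', 'P', 'Q', 'R']})

# Each key is a value to look for, each value is its replacement.
# replace() returns a new DataFrame; df itself is left unchanged.
replaced = df.replace({'O': 'X', 'P': 'Y'})

print(replaced)
```

Values that do not appear as keys in the dictionary are left as they are.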
DataFrame - apply() function. The apply() function is used to apply a function along an axis of the DataFrame. Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
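To make the axis behaviour a bit more concrete, here is a minimal sketch. The column names x and y and the numbers are invented for the example.

```python
import pandas as pd

# Hypothetical numeric frame, only used to show what apply() passes in.
df = pd.DataFrame({'x': [1, 2, 3], 'y': [10, 20, 30]})

# axis=0: the function is called once per column.
# Each call receives a Series indexed by the DataFrame's row labels.
col_sums = df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function is called once per row.
# Each call receives a Series indexed by the DataFrame's column names.
row_sums = df.apply(lambda row: row['x'] + row['y'], axis=1)

print(col_sums)
print(row_sums)
```

With axis=1 the function can read several columns of the same row by name, which is handy when a result depends on more than one column.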
Maybe there is a more elegant way, but I would do in this way:
df['C'] = df['B'].apply(lambda x: f(x)[0])
df['D'] = df['B'].apply(lambda x: f(x)[1])
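This applies the function to every value in column B and takes the first and second elements of the output. Note that it recomputes C and D for all rows, including the ones that already had values. It returns: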
   A  B  C   D
0  1  O  Ok  On
1  2  P  Pk  Pn
2  3  Q  Qk  Qn
3  4  R  Rk  Rn
EDIT:
In a more concise way, thanks to this answer:
df[['C','D']] = df['B'].apply(lambda x: pd.Series(f(x)))  # calls f only once per row
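Since the question asks for f to be applied only to the rows missing C and D, one possible way is to build a boolean mask first and assign only into those rows. This is just a sketch: it assumes the blanks in the question are NaN (if they are empty strings, the mask would need to compare against '' instead), and the names mask, results and filled are mine.

```python
import numpy as np
import pandas as pd

def f(p):
    # Stand-in for the real, more complicated function from the question.
    return p + 'k', p + 'n'

# Rebuild the frame from the question; assumption: the blanks are NaN.
df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': ['O', 'P', 'Q', 'R'],
     'C': ['s', np.nan, np.nan, 'h'],
     'D': ['h', np.nan, np.nan, 'm']},
    index=[2, 4, 7, 9],
)

# True for the rows where both C and D are missing.
mask = df['C'].isna() & df['D'].isna()

# Call f once per masked row only; each call returns a (C, D) tuple.
results = [f(b) for b in df.loc[mask, 'B']]

# Wrap the tuples in a frame with matching row labels and column names,
# so the assignment lines up with the masked rows.
filled = pd.DataFrame(results, index=df.index[mask], columns=['C', 'D'])
df.loc[mask, ['C', 'D']] = filled

print(df)
```

Rows 2 and 9 keep their original values, and f is never called for them, which matters if the real function is expensive.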