How can I do a conditional assignment in pandas based on the values of two columns? Conceptually something like the following:
Column_D = Column_B / (Column_B + Column_C) if Column_C is not null else Column_C
Concrete example:
import pandas as pd
import numpy as np
df = pd.DataFrame({'b': [2,np.nan,4,2,np.nan], 'c':[np.nan,1,2,np.nan,np.nan]})
b c
0 2.0 NaN
1 NaN 1.0
2 4.0 2.0
3 2.0 NaN
4 NaN NaN
I want to have a new column d whose result is the division of column b by the sum of b and c, if c is not null; otherwise the value should be the value at column c.
Something conceptually like the following:
df['d'] = df['b']/(df['b']+df['c']) if not df['c'].isnull() else df['c']
Desired result:
b c d
0 2.0 NaN NaN
1 NaN 1.0 1.0
2 4.0 2.0 0.66
3 2.0 NaN NaN
4 NaN NaN NaN
How can I achieve this?
Try this (if you want your desired result set, checking the b column):
In [30]: df['d'] = np.where(df.b.notnull(), df.b/(df.b+df.c), df.c)
In [31]: df
Out[31]:
b c d
0 2.0 NaN NaN
1 NaN 1.0 1.000000
2 4.0 2.0 0.666667
3 2.0 NaN NaN
4 NaN NaN NaN
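For anyone who wants to run this outside an IPython session, here is a minimal standalone sketch of the same b-check. It assumes the df from the question. The intermediate name ratio is only introduced here for readability and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same frame as in the question
df = pd.DataFrame({'b': [2, np.nan, 4, 2, np.nan],
                   'c': [np.nan, 1, 2, np.nan, np.nan]})

# Element-wise ratio; NaN in either column propagates to the result
ratio = df['b'] / (df['b'] + df['c'])

# Where b is present take the ratio, otherwise fall back to c
df['d'] = np.where(df['b'].notnull(), ratio, df['c'])
print(df)
```

np.where evaluates both branches for every row and then picks element-wise, which is why it works here while a plain Python if/else on a whole Series does not.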
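Or this, checking the c column as in your description (note that row 1 then comes out as NaN rather than the 1.0 in your desired result):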
In [32]: df['d'] = np.where(df.c.notnull(), df.b/(df.b+df.c), df.c)
In [33]: df
Out[33]:
b c d
0 2.0 NaN NaN
1 NaN 1.0 NaN
2 4.0 2.0 0.666667
3 2.0 NaN NaN
4 NaN NaN NaN
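A side note on the two variants, as a hedged sketch rather than a recommendation: because NaN propagates through arithmetic, the c-check appears to give the same values as the plain division, and the b-check can also be written with pandas' own Series.where instead of np.where. The names ratio, d_c_check and d_b_check are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'b': [2, np.nan, 4, 2, np.nan],
                   'c': [np.nan, 1, 2, np.nan, np.nan]})

ratio = df['b'] / (df['b'] + df['c'])

# c-check: wherever c is NaN the ratio is already NaN, so the fallback to c changes nothing
d_c_check = pd.Series(np.where(df['c'].notnull(), ratio, df['c']), index=df.index)

# b-check with pandas only: keep the ratio where b is present, else use c
d_b_check = ratio.where(df['b'].notnull(), df['c'])

print(d_c_check.equals(ratio))
print(d_b_check)
```

Series.where keeps the original index, which can matter if the frame does not have a default RangeIndex.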