I have a DataFrame like this:
-----------------
RecID | A   | B
-----------------
1     | NaN | x
2     | y   | NaN
3     | z   | NaN
4     | NaN | a
5     | NaN | b
And I want to create a new column, C, from A and B such that if A is null then fill with B and if B is null then fill with A:
-----------------------
RecID | A   | B   | C
-----------------------
1     | NaN | x   | x
2     | y   | NaN | y
3     | z   | NaN | z
4     | NaN | a   | a
5     | NaN | b   | b
Lastly, is there an efficient way to do this if I have more than two columns? For example, I have columns A-Z and want to create a new column A1 out of columns A-Z, similar to the above.
pandas
lookup
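This is the generalizable solution the OP was looking for, and it works across an arbitrary number of columns: widen the 'A':'B' slice to cover every column of interest. Note that DataFrame.lookup was deprecated in pandas 1.2.0 and removed in 2.0.0, so this snippet only runs on older versions.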
lookup = df.loc[:, 'A':'B'].notnull().idxmax(1)
df.assign(A1=df.lookup(lookup.index, lookup.values))
RecID A B A1
0 1 NaN x x
1 2 y NaN y
2 3 z NaN z
3 4 NaN a a
4 5 NaN b b
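On pandas versions without DataFrame.lookup, the same idea (find the first non-null column per row, then pick that value) can be sketched with factorize and NumPy indexing. This is a minimal sketch, not the original answer's code; the sample frame is rebuilt from the question, and the names first_col, codes, labels and picked are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})

# Label of the first non-null column in each row
first_col = df.loc[:, "A":"B"].notnull().idxmax(axis=1)

# Turn the labels into integer codes, then pick one value per row
# with NumPy integer indexing (row i, column codes[i])
codes, labels = pd.factorize(first_col)
picked = df.reindex(labels, axis=1).to_numpy()[np.arange(len(df)), codes]

result = df.assign(A1=picked)
print(result)
```

As with the lookup version, widening the "A":"B" slice should extend this to more columns. A row whose columns are all null is not covered by this sketch.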
fillna
df.assign(C=df.A.fillna(df.B))
RecID A B C
0 1 NaN x x
1 2 y NaN y
2 3 z NaN z
3 4 NaN a a
4 5 NaN b b
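For more than two columns, one possible extension of the fillna idea is to back-fill along each row and keep the leftmost column, which then holds the first non-null value. This is a sketch under the assumption that the columns of interest are contiguous, so a label slice such as "A":"B" selects them; the frame is rebuilt from the question.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})

# Back-fill across columns (axis=1), so each cell takes the next
# non-null value to its right; the first column is then the coalesced result.
coalesced = df.loc[:, "A":"B"].bfill(axis=1).iloc[:, 0]

result = df.assign(C=coalesced)
print(result)
```

Rows where every selected column is null stay NaN in the new column.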
mask
df.assign(C=df.A.mask(df.A.isnull(), df.B))
RecID A B C
0 1 NaN x x
1 2 y NaN y
2 3 z NaN z
3 4 NaN a a
4 5 NaN b b
combine_first
df.assign(C=df.A.combine_first(df.B))
RecID A B C
0 1 NaN x x
1 2 y NaN y
2 3 z NaN z
3 4 NaN a a
4 5 NaN b b
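Because combine_first takes one other Series at a time, a multi-column version can be sketched by folding it over the columns with functools.reduce. The three-column frame below is hypothetical (the question only shows A and B), and the column list cols is an assumption about which columns to coalesce and in what priority order.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical three-column frame; the last row is null everywhere
df = pd.DataFrame({
    "A": [np.nan, "y", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, np.nan],
    "C": [np.nan, np.nan, "z", np.nan],
})

cols = ["A", "B", "C"]  # priority order: earlier columns win

# Fold combine_first left to right: keep the running result where it is
# non-null, otherwise take the value from the next column.
coalesced = reduce(
    lambda left, right: left.combine_first(right),
    (df[c] for c in cols),
)

result = df.assign(A1=coalesced)
print(result)
```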
numpy
np.where
df.assign(C=np.where(df.A.notnull(), df.A, df.B))
RecID A B C
0 1 NaN x x
1 2 y NaN y
2 3 z NaN z
3 4 NaN a a
4 5 NaN b b
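np.where handles exactly two choices. For several columns, np.select accepts a list of conditions and a matching list of choices and takes the first condition that holds in each row. The following is a sketch of that approach on the question's frame; the list cols is an assumption and would be extended to cover the real columns.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})

cols = ["A", "B"]  # columns to coalesce, in priority order

# One boolean mask and one value array per column
conditions = [df[c].notnull().to_numpy() for c in cols]
choices = [df[c].to_numpy() for c in cols]

# For each row, take the value from the first column whose mask is True;
# rows with no non-null value fall back to NaN.
picked = np.select(conditions, choices, default=np.nan)

result = df.assign(C=picked)
print(result)
```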