I have a data frame like this:
----------------
RecID| A  |B
----------------
1    |NaN | x 
2    |y   | NaN 
3    |z   | NaN
4    |NaN | a 
5    |NaN | b 
And I want to create a new column, C, from A and B such that if A is null then fill with B and if B is null then fill with A:
----------------------
RecID|A   |B    |C 
----------------------
1    |NaN | x   |x
2    |y   | NaN |y 
3    |z   | NaN |z
4    |NaN | a   |a
5    |NaN | b   |b
Lastly, is there an efficient way to do this if I have more than two columns? For example, I have columns A-Z and want to create a new column A1 out of columns A-Z, similar to the above.
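pandas
lookup
This is the generalizable solution the OP was looking for; it works across an arbitrary number of columns. Note that DataFrame.lookup was deprecated in pandas 1.2.0 and removed in 2.0.0, so the snippet below only runs on older versions.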
lookup = df.loc[:, 'A':'B'].notnull().idxmax(1)
df.assign(A1=df.lookup(lookup.index, lookup.values))
   RecID    A    B A1
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b
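On pandas 2.0 and later, where `DataFrame.lookup` is no longer available, the same row-by-row pick can be sketched with plain NumPy indexing. This is one possible replacement, not the original answer's method. The frame is rebuilt from the question's sample data, and the names `cols`, `first_valid`, `row_pos` and `col_pos` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})

# Columns to search; widen the slice (e.g. "A":"Z") for more columns
cols = df.loc[:, "A":"B"]

# Label of the first non-null column in each row
first_valid = cols.notna().idxmax(axis=1)

# Translate those labels into integer positions, then index the raw array
col_pos = cols.columns.get_indexer(first_valid)
row_pos = np.arange(len(cols))
result = df.assign(A1=cols.to_numpy()[row_pos, col_pos])
```

If every searched column in a row is null, `idxmax` falls back to the first column, so that row's `A1` stays null.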
fillna  
df.assign(C=df.A.fillna(df.B))
   RecID    A    B  C
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b
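For more than two columns, chaining `fillna` calls gets unwieldy. One common alternative, sketched here on the assumption that "first non-null value, reading left to right" is the desired rule, is to back-fill across the columns and keep the leftmost one. The frame is rebuilt from the question's sample data.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})

# Back-fill along each row so the leftmost column ends up holding the
# first non-null value, then keep only that column.
# Widen the slice (e.g. "A":"Z") to cover more columns.
filled = df.loc[:, "A":"B"].bfill(axis=1)
result = df.assign(A1=filled.iloc[:, 0])
```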
mask  
df.assign(C=df.A.mask(df.A.isnull(), df.B))
   RecID    A    B  C
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b
combine_first  
df.assign(C=df.A.combine_first(df.B))
   RecID    A    B  C
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b
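`combine_first` also extends to any number of columns by folding it over a list of column names. The following is a minimal sketch using `functools.reduce`; the list `cols` is an assumption standing in for whichever columns should be merged, in priority order.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})

# Columns to merge, highest priority first (extend as needed)
cols = ["A", "B"]

# Start from the first column and fill its gaps from each later column in turn
merged = reduce(
    lambda acc, name: acc.combine_first(df[name]),
    cols[1:],
    df[cols[0]],
)
result = df.assign(A1=merged)
```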
numpy
np.where
df.assign(C=np.where(df.A.notnull(), df.A, df.B))
   RecID    A    B  C
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b
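`np.where` only chooses between two inputs. For several columns, `np.select` takes a list of conditions and a matching list of choices and, for each row, uses the first condition that holds. A sketch under the same sample data, with `cols` as an assumed priority-ordered list of column names:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})

# Columns to check, highest priority first (extend as needed)
cols = ["A", "B"]

# One boolean mask and one array of values per column
conditions = [df[c].notna().to_numpy() for c in cols]
choices = [df[c].to_numpy() for c in cols]

# For each row, take the value from the first column whose mask is True;
# rows where every column is null get NaN.
result = df.assign(A1=np.select(conditions, choices, default=np.nan))
```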