I have a pandas dataframe like the following:
Col1 Col2 Col3
0 A 7 NaN
1 B 16 NaN
1 B 16 15
What I want to do is to swap Col2 with Col3 where the value of Col3 is NaN
. Based on other posts and answers on SO, I have this code so far:
df[['Col2', 'Col3']] = df[['Col3', 'Col2']].where(df[['Col3']].isnull())
But this does not seem to be working properly and gives me the following:
Col1 Col2 Col3
0 A NaN NaN
1 B NaN NaN
1 B NaN NaN
Is there something that I might be missing here?
Update: My desired output looks like:
Col1 Col2 Col3
0 A NaN 7
1 B NaN 16
1 B 16 15
Thanks
You can use loc
to do the swap:
df.loc[df['Col3'].isnull(), ['Col2', 'Col3']] = df.loc[df['Col3'].isnull(), ['Col3', 'Col2']].values
Note that .values
is required to make sure the swap is done properly, otherwise Pandas would try to align based on index and column names, and no swap would occur.
You can also just reassign each row individually, if you feel the code is cleaner:
null_idx = df['Col3'].isnull()
df.loc[null_idx, 'Col3'] = df['Col2']
df.loc[null_idx, 'Col2'] = np.nan
The resulting output:
Col1 Col2 Col3
0 A NaN 7.0
1 B NaN 16.0
2 B 16.0 15.0
Try this: (its faster)
df["Col3"], df["Col2"] = np.where(df['Col3'].isnull(), [df["Col2"], df["Col3"]], [df["Col3"], df["Col2"] ])
df
Col1 Col2 Col3
0 A NaN 7.0
1 B NaN 16.0
1 B 16.0 15.0
%timeit df.loc[df['Col3'].isnull(), ['Col2', 'Col3']] = df.loc[df['Col3'].isnull(), ['Col3', 'Col2']].values
100 loops, best of 3: 2.68 ms per loop
%timeit df["Col3"], df["Col2"] = np.where(df['Col3'].isnull(), [df["Col2"], df["Col3"]], [df["Col3"], df["Col2"] ])
1000 loops, best of 3: 592 µs per loop
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With