Say I have two DataFrames:
df1 = pd.DataFrame({'A': ['foo', 'bar', 'test'], 'b': [1, 2, 3], 'c': [3, 4, 5]})
df2 = pd.DataFrame({'A': ['foo', 'baz'], 'c': [2, 1]})
df1
      A  b  c
0   foo  1  3
1   bar  2  4
2  test  3  5
df2
     A  c
0  foo  2
1  baz  1
After merging I want:
df1
      A    b  c
0   foo    1  3
1   bar    2  4
2  test    3  5
3   baz  NaN  1
If df1['A'] does not contain a value that appears in df2['A'], only those rows from df2 need to be added to df1. When there is a match in column A, ignore the other columns from df2.
I tried pd.merge(df1, df2, on=['A'], how='outer'), but this does not give the expected output.
Additionally, for future reference, I also want to know how to get the below output:
df1
      A    b  c
0   foo    1  2
1   bar    2  4
2  test    3  5
3   baz  NaN  1
Note the updated column c value for foo, which is taken from df2.
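Try combine_first, which keeps df1's values and only fills in what is missing from df2. Note that the result comes back sorted by A: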
out = df1.set_index('A').combine_first(df2.set_index('A')).reset_index()
      A    b  c
0   bar  2.0  4
1   baz  NaN  1
2   foo  1.0  3
3  test  3.0  5
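If the original row order of df1 matters, or you also want the second output from the question, something like the following may work. This is only a sketch: the names new_rows, appended and updated are made up for illustration, and it assumes the values in column A are unique within each frame.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['foo', 'bar', 'test'], 'b': [1, 2, 3], 'c': [3, 4, 5]})
df2 = pd.DataFrame({'A': ['foo', 'baz'], 'c': [2, 1]})

# First output: keep df1 untouched and append only the df2 rows
# whose A value does not already appear in df1.
new_rows = df2[~df2['A'].isin(df1['A'])]
appended = pd.concat([df1, new_rows], ignore_index=True)

# Second output: swap the operands so that df2's values win on a match,
# then restore the row order used in `appended`.
updated = (
    df2.set_index('A')
    .combine_first(df1.set_index('A'))
    .reindex(appended['A'])
    .reset_index()
)

print(appended)
print(updated)
```

Here appended keeps c = 3 for foo, while updated takes c = 2 from df2. In both frames column b is shown as float, because the baz row has no b value and is filled with NaN.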