Consider the following DataFrame X:
Col A Col B
1 2
3 4
5 6
And the DataFrame Y:
Col A Col B
3 7
8 9
Is there a built-in function in pandas that will combine the two DataFrames using Col A as the key, updating the value in Col B if the key already exists and appending the row otherwise? The output of this function on X and Y should be:
Col A Col B
1 2
3 7
5 6
8 9
I've looked into merge, update, and append, but they don't seem to act the way I want: update updates by index instead of by Col A value, merge doesn't overwrite, etc. Thanks!
One way to do this is to concat, then drop the duplicates:
In [11]: df = pd.concat([dfX, dfY])
In [12]: df
Out[12]:
ColA ColB
0 1 2
1 3 4
2 5 6
0 3 7
1 8 9
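In [13]: df.drop_duplicates(subset=['ColA'], keep='last')
Out[13]:
ColA ColB
0 1 2
2 5 6
0 3 7
1 8 9
Note: keep='last' means that when a ColA value is duplicated, the row from dfY (which was concatenated last) is the one kept, so you are "updating from dfY". Older pandas versions spelled these arguments cols and take_last.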
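For reference, here is a self-contained sketch of the concat-then-drop-duplicates approach, written against current pandas, where the arguments are spelled subset and keep. The helper name upsert is hypothetical (it is not a pandas function), and the final sort and index reset are an optional extra, assumed here only to make the result match the layout asked for in the question.

```python
import pandas as pd

# Sample frames from the question (column names as used in the answer).
dfX = pd.DataFrame({"ColA": [1, 3, 5], "ColB": [2, 4, 6]})
dfY = pd.DataFrame({"ColA": [3, 8], "ColB": [7, 9]})


def upsert(old, new, key):
    """Hypothetical helper: rows of `new` replace rows of `old` that share `key`."""
    # Stack the frames so that rows from `new` come after rows from `old`.
    combined = pd.concat([old, new])
    # For each duplicated key keep the last occurrence, i.e. the row from `new`.
    deduped = combined.drop_duplicates(subset=[key], keep="last")
    # Optional tidy-up: order by key and rebuild a clean 0..n-1 index.
    return deduped.sort_values(key).reset_index(drop=True)


result = upsert(dfX, dfY, "ColA")
print(result)
```

Note that this assumes the key column has no duplicates within dfX itself; if it does, those rows would be collapsed as well.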