Pandas DataFrame updating Column values with other DataFrame

Question

Consider the following DataFrame X:

Col A Col B 
1     2
3     4
5     6

And the DataFrame Y:

Col A Col B 
3     7
8     9

Is there a built-in function in pandas that combines the two DataFrames using Col A as the key, updating the value in Col B if the key already exists and appending the row otherwise? The output of this function on X and Y should be:

Col A Col B
1     2
3     7
5     6
8     9

I've looked into merge, update, and append, but they don't seem to behave the way I want: update works by index instead of by Col A value, merge doesn't overwrite, etc. Thanks!

Andy Hayden · Accepted Answer

One way to do this is to concat then drop the duplicates:

In [11]: df = pd.concat([dfX, dfY])

In [12]: df
Out[12]:
   ColA  ColB
0     1     2
1     3     4
2     5     6
0     3     7
1     8     9

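In [13]: df.drop_duplicates(subset=['ColA'], keep='last')
Out[13]:
   ColA  ColB
0     1     2
2     5     6
0     3     7
1     8     9

Note: keep='last' means you are "updating from dfY", because dfY's rows come last in the concatenated frame. (Older pandas versions spelled these arguments cols= and take_last=True.)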
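The concat-then-drop-duplicates idea can be wrapped up as a small helper. This is only a sketch: the function name upsert_by_key is made up for illustration and is not a pandas API, and the final sort and index reset are an added assumption so that the result matches the row order asked for in the question.

```python
import pandas as pd


def upsert_by_key(base, updates, key):
    """Hypothetical helper: rows of `updates` replace rows of `base`
    that share the same `key` value; rows with new keys are appended."""
    combined = pd.concat([base, updates])
    # keep="last" retains the row from `updates` when a key is in both frames,
    # because `updates` comes last in the concatenation.
    deduped = combined.drop_duplicates(subset=[key], keep="last")
    # Assumption: sort by key and rebuild the index to get a tidy result.
    return deduped.sort_values(key).reset_index(drop=True)


dfX = pd.DataFrame({"ColA": [1, 3, 5], "ColB": [2, 4, 6]})
dfY = pd.DataFrame({"ColA": [3, 8], "ColB": [7, 9]})

result = upsert_by_key(dfX, dfY, "ColA")
print(result)
```

If the original row order matters more than key order, the sort_values call can be dropped; the deduplicated rows then stay in concatenation order, as in the session above.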

Tags:

pandas

TheoretiCAL