Keeping both DataFrame indexes on merge

Question

I'm sure this question must have already been answered somewhere but I couldn't find an answer that suits my case.
I have 2 pandas DataFrames

a = pd.DataFrame({'A1':[1,2,3], 'A2':[2,4,6]}, index=['a','b','c'])
b = pd.DataFrame({'A1':[3,5,6], 'A2':[3,6,9]}, index=['a','c','d'])

I want to merge them in order to obtain something like

result = pd.DataFrame({
    'A1' : [3,2,5,6],
    'A2' : [3,4,6,9]
}, index=['a','b','c','d'])

Basically, I want a new df with the union of both indexes. Where indexes match, the value in each column should be updated with the one from the second df (in this case b). Where there is no match the value is taken from the starting df (in this case a).
I tried with merge(), join() and concat() but I could not manage to obtain this result.

sophocles · Accepted Answer

You could use pd.concat to create one dataframe (b being the first one as it is b that has a priority for it's values to be kept over a), and then drop the duplicated index:

Using your sample data:

c = pd.concat([b,a])
c[~c.index.duplicated()].sort_index()

prints:

Keeping both DataFrame indexes on merge

Tags:

python

database

pandas

dataframe

Mirko

1 Answers

sophocles

Recent Activity

Donate For Us

Keeping both DataFrame indexes on merge

Tags:

python

database

pandas

dataframe

Mirko

1 Answers

sophocles

Related questions

Recent Activity

Donate For Us