Concat two DataFrames on missing indices

Tags: python, pandas

I have two DataFrames and want to take rows from the second one only where their index is not already present in the first one.

What is the most efficient way to do this?

Example:

df_1
idx     val
0      0.32
1      0.54
4      0.26
5      0.76
7      0.23

df_2
idx     val
1     10.24
2     10.90
3     10.66
4     10.25
6     10.13
7     10.52

df_final
idx     val
0      0.32
1      0.54
2     10.90
3     10.66
4      0.26
5      0.76
6     10.13
7      0.23

Recap: I need to add the rows in df_2 for which the index is not already in df_1.


EDIT

Removed some indices in df_2 to illustrate that not all indices from df_1 are covered in df_2.
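
To reproduce the example, here is a minimal sketch of the two frames. It assumes the index is named idx and val is a float column, as the tables above suggest. The last lines list the indices of df_2 that are absent from df_1, which are the rows that need to be added.

```python
import pandas as pd

# Example frames from the question; the index name "idx" is an assumption
# based on the header shown in the tables.
df_1 = pd.DataFrame(
    {"val": [0.32, 0.54, 0.26, 0.76, 0.23]},
    index=pd.Index([0, 1, 4, 5, 7], name="idx"),
)
df_2 = pd.DataFrame(
    {"val": [10.24, 10.90, 10.66, 10.25, 10.13, 10.52]},
    index=pd.Index([1, 2, 3, 4, 6, 7], name="idx"),
)

# Indices present in df_2 but not in df_1: the rows to add.
missing = df_2.index.difference(df_1.index)
print(list(missing))  # [2, 3, 6]
```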

asked Feb 20 '17 12:02 by Jivan


1 Answer

You can use combine_first, which aligns both frames on the union of their indices and keeps the values of df_1 wherever they exist, or reindex with fillna:

df = df_1.combine_first(df_2)
print(df)
       val
idx       
0     0.32
1     0.54
2    10.90
3    10.66
4     0.26
5     0.76
6    10.13
7     0.23

# reindex on the union of both indices so that rows 0 and 5 of df_1 are kept
df = df_1.reindex(df_1.index.union(df_2.index)).fillna(df_2)
print(df)
       val
idx       
0     0.32
1     0.54
2    10.90
3    10.66
4     0.26
5     0.76
6    10.13
7     0.23
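
If df_1 may itself contain NaN values that should be preserved, an approach that selects purely by index may be safer, since combine_first and fillna would also fill those NaNs from df_2. The following is a sketch of that alternative, not part of the original answer; it rebuilds the example frames (index name idx assumed) and uses Index.isin with pd.concat.

```python
import pandas as pd

# Example frames from the question (index name "idx" is an assumption).
df_1 = pd.DataFrame(
    {"val": [0.32, 0.54, 0.26, 0.76, 0.23]},
    index=pd.Index([0, 1, 4, 5, 7], name="idx"),
)
df_2 = pd.DataFrame(
    {"val": [10.24, 10.90, 10.66, 10.25, 10.13, 10.52]},
    index=pd.Index([1, 2, 3, 4, 6, 7], name="idx"),
)

# Keep only the rows of df_2 whose index is absent from df_1.
extra = df_2[~df_2.index.isin(df_1.index)]

# Append them to df_1 and restore the index order.
df_final = pd.concat([df_1, extra]).sort_index()
print(df_final)
```

Here the decision to take a row from df_2 depends only on whether its index exists in df_1, never on the values stored in df_1.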

answered Sep 22 '22 22:09 by jezrael