
How to merge/join empty dataframe with another filled dataframe by equal indices and column names?

I want to combine two dataframes. One, say Empty_DF, is empty and large (320 columns by 240 rows), with plain integers as index and column names. The other, ROI_DF, is smaller and filled, and its index and column names match those of Empty_DF at a certain location.

I tried the pandas.merge function, as suggested in this question; however, it just appends the columns to the empty dataframe Empty_DF instead of replacing the values.

import pandas as pd

Empty_DF = pd.DataFrame({'a':[0,0,0,0,0,0],
            'b':[0,0,0,0,0,0], 'c':[0,0,0,0,0,0]}, index=list('abcdef'))

print(Empty_DF)
   a  b  c
a  0  0  0
b  0  0  0
c  0  0  0
d  0  0  0
e  0  0  0
f  0  0  0

ROI_DF = pd.DataFrame({'a':range(4),
            'b':[5,6,7,8]}, index=list('abce'))

print(ROI_DF)
   a  b
a  0  5
b  1  6
c  2  7
e  3  8

For this example the approach below is sufficient: the dataframe is small, so fillna followed by drop can be used. Is there a more efficient way that scales to bigger dataframes?

df3 = pd.merge(Empty_DF, ROI_DF, how='left', left_index=True,
               right_index=True, suffixes=('_x', ''))
# Fill the gaps left by the merge with the original values, then drop the helper columns.
df3['a'] = df3['a'].fillna(df3['a_x'])
df3['b'] = df3['b'].fillna(df3['b_x'])
df3 = df3.drop(['a_x', 'b_x'], axis=1)

print(df3)
   a  b  c
a  0  5  0
b  1  6  0
c  2  7  0
d  0  0  0
e  3  8  0
f  0  0  0
FABeng asked Oct 07 '19 20:10




3 Answers

This is a perfect case for DataFrame.update, which aligns on indices:

Empty_DF.update(ROI_DF)

Output:

print(Empty_DF)

     a    b  c
a  0.0  5.0  0
b  1.0  6.0  0
c  2.0  7.0  0
d  0.0  0.0  0
e  3.0  8.0  0
f  0.0  0.0  0

Note that update works in place. The documentation describes it as:

Modify in place using non-NA values from another DataFrame.

That means your original dataframe is overwritten with the new values. To keep it unchanged, update a copy instead:

df3 = Empty_DF.copy()
df3.update(ROI_DF)
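
As a self-contained check of the in-place behaviour, here is a minimal sketch that rebuilds the two frames from the question (assuming the third column was meant to be 'c') and updates a copy. It compares values rather than dtypes, since whether update keeps integer columns or upcasts them to float may depend on the pandas version.

```python
import pandas as pd

# Frames from the question; the third column is assumed to be 'c'.
Empty_DF = pd.DataFrame(0, index=list('abcdef'), columns=list('abc'))
ROI_DF = pd.DataFrame({'a': range(4), 'b': [5, 6, 7, 8]}, index=list('abce'))

# update() returns None and modifies its caller, so work on a copy.
df3 = Empty_DF.copy()
df3.update(ROI_DF)

print(df3)       # ROI values written into rows a, b, c, e
print(Empty_DF)  # the original frame is unchanged
```

Writing `df3 = Empty_DF.update(ROI_DF)` instead would leave df3 set to None, because update does not return the modified frame.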
Erfan answered Oct 11 '22 17:10



You can either use update:

Empty_DF.update(ROI_DF)

output:

     a    b  c
a  0.0  5.0  0
b  1.0  6.0  0
c  2.0  7.0  0
d  0.0  0.0  0
e  3.0  8.0  0
f  0.0  0.0  0

Or loc:

Empty_DF.loc[ROI_DF.index, ROI_DF.columns] = ROI_DF

output:

   a  b  c
a  0  5  0
b  1  6  0
c  2  7  0
d  0  0  0
e  3  8  0
f  0  0  0
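
To illustrate how the loc assignment might look at the size mentioned in the question, here is a rough sketch with integer labels. The frame names `big` and `roi`, the position of the block, and its contents are all made up for illustration; only the 240 by 320 shape comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Empty_DF: 240 rows by 320 columns of zeros,
# with default integer labels 0..239 (rows) and 0..319 (columns).
n_rows, n_cols = 240, 320
big = pd.DataFrame(np.zeros((n_rows, n_cols), dtype=int))

# Hypothetical stand-in for ROI_DF: a filled 10 x 20 block whose labels
# match rows 100..109 and columns 50..69 of `big`.
roi = pd.DataFrame(
    np.arange(1, 201).reshape(10, 20),
    index=range(100, 110),
    columns=range(50, 70),
)

# Label-based assignment: only the cells named by roi's labels are overwritten.
big.loc[roi.index, roi.columns] = roi

# Show the block together with a margin of untouched zeros around it.
print(big.loc[98:111, 48:71])
```

Because the assignment goes by label rather than by position, the block lands in the right place even when the labels are not contiguous.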
Quang Hoang answered Oct 11 '22 18:10



In your case, use reindex_like:

yourdf = ROI_DF.reindex_like(Empty_DF).fillna(0)
     a    b    c
a  0.0  5.0  0.0
b  1.0  6.0  0.0
c  2.0  7.0  0.0
d  0.0  0.0  0.0
e  3.0  8.0  0.0
f  0.0  0.0  0.0
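
Filling with 0 reproduces Empty_DF only because every cell of Empty_DF is 0. If the larger frame held other values, one possible variant (a sketch, not part of the original answer) is to pass that frame itself to fillna so the gaps are taken from it. The frame `base` and its -1 background are hypothetical.

```python
import pandas as pd

# Hypothetical non-zero background frame standing in for Empty_DF.
base = pd.DataFrame(-1, index=list('abcdef'), columns=list('abc'))
ROI_DF = pd.DataFrame({'a': range(4), 'b': [5, 6, 7, 8]}, index=list('abce'))

# Expand ROI_DF to base's labels (missing cells become NaN),
# fill those cells from base, then restore base's column dtypes.
result = (
    ROI_DF.reindex_like(base)
    .fillna(base)
    .astype(base.dtypes.to_dict())
)

print(result)
```

The final astype step undoes the float upcast that the intermediate NaN values cause.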
BENY answered Oct 11 '22 17:10
