Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merging two DataFrames

Tags:

I have 2 DataFrames which I would like to merge. I have looked at the documentation and tried to perform the following operation but an getting confused as to how to do it. Like I said I have 2 DataFrames:

df1:        id        name  type currency 0  BTA.S   Applewood  Hard      GBp 1  VOD.S    Softwood  Soft      GBp 

and

df2:     id BTA.S    301.221525 VOD.S    213.791400 

and I would like to return:

      id        name  type currency       price 0  BTA.S   Applewood  Hard      GBp  301.221525 1  VOD.S    Softwood  Soft      GBp  213.791400 

Where the price column from the df2 is merged with df1. (Just to let you know there will be alot more wood types by the time I've finished).

I have tried a few methods of doing this:

Result = df1.merge(df2[['*.S']], left_on='id', right_index=True)  

where I met the exception:

ValueError: can not merge DataFrame with instance of type <class 'pandas.core.series.Series'> 

and

Result = pd.concat([Df1, Df2], axis=1, ignore_index=True) 

where I get the exception:

ValueError: labels ['type'] not contained in axis 

But I am getting confused.

like image 735
Stacey Avatar asked Jun 22 '16 12:06

Stacey


People also ask

How do I merge two data frames?

The concat() function can be used to concatenate two Dataframes by adding the rows of one to the other. The merge() function is equivalent to the SQL JOIN clause. 'left', 'right' and 'inner' joins are all possible.

How do I merge 10 DataFrames in pandas?

We can use either pandas. merge() or DataFrame. merge() to merge multiple Dataframes. Merging multiple Dataframes is similar to SQL join and supports different types of join inner , left , right , outer , cross .

How do I merge two datasets in Python?

To use merge() , you need to provide at least below two arguments. for example, combining above two datasets without mentioning anything else like- on which columns we want to combine the two datasets. pd. merge() automatically detects the common column between two datasets and combines them on this column.

How do I merge two DataFrames in pandas based on common column?

To merge two Pandas DataFrame with common column, use the merge() function and set the ON parameter as the column name.


2 Answers

The error message indicates that df2 is of type pd.Series. You need to convert df2 .to_frame() as .merge() needs a pd.DataFrame() input (see docs):

df1.merge(df2[['*.S']].to_frame(), left_on='id', right_index=True) 

while you probably also just could:

df1.merge(df2.to_frame(), left_on='id', right_index=True) 

Alternatively, you can use pd.DataFrame.join() which accepts a pd.Series.

like image 93
Stefan Avatar answered Sep 21 '22 04:09

Stefan


This error means that one of your objects is not a pandas Data Frame.

ValueError: can not merge DataFrame with instance of type <class 'pandas.core.series.Series'> 

To prove this to yourself,

print(type(df2)) 

And that should output pandas.core.series.Series

To achieve your desired result,

df2 = df2.to_frame().reset_index() df2.columns = ['id', 'price'] df1.merge(df2) 

Outputs:

    id  name    type    currency    price 0   BTA.S   Applewood   Hard    GBp     301.221525 1   VOD.S   Softwood    Soft    GBp     213.791400 
like image 39
michael_j_ward Avatar answered Sep 19 '22 04:09

michael_j_ward