
Unable to join pandas dataframe on string type

I have two DataFrame objects whose columns are as follows:

Dataframe 1:

df.dtypes

Output:

ImageID       object
Source        object
LabelName     object
Confidence     int64
dtype: object

Dataframe 2:

a.dtypes

Output:

LabelName       object
ReadableName    object
dtype: object

I am trying to combine these two DataFrames as follows:

combined = df.join(a, on='LabelName')

But I am getting the following error:

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

But I am merging them on LabelName, which contains only strings (object dtype) in both DataFrames.

Am I missing something here?

InAFlash asked Oct 20 '18 06:10



2 Answers

About the on parameter, the documentation says:

Column or index level name(s) in the caller to join on the index in other, otherwise joins index-on-index.

Note that join() always uses other.index. You can try this:

df.join(a.set_index('LabelName'), on='LabelName')

Or use df.merge(a, on='LabelName') instead, which matches column to column.
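To make the difference concrete, here is a minimal sketch of the failing call and the two suggested fixes. The sample values (image IDs, sources, label codes, readable names) are invented for illustration; only the column names and dtypes come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names and dtypes match the question.
df = pd.DataFrame({
    "ImageID": ["img1", "img2", "img3"],
    "Source": ["verification", "verification", "crowdsource"],
    "LabelName": ["/m/01", "/m/02", "/m/01"],
    "Confidence": [1, 0, 1],
})
a = pd.DataFrame({
    "LabelName": ["/m/01", "/m/02"],
    "ReadableName": ["Cat", "Dog"],
})

# join() compares df['LabelName'] (strings) with a.index, which is still the
# default integer index, so pandas refuses to match the two.
try:
    df.join(a, on="LabelName")
    join_failed = False
except ValueError:
    join_failed = True

# Fix 1: make LabelName the index of `a`, so join() has strings to match on.
joined = df.join(a.set_index("LabelName"), on="LabelName")

# Fix 2: merge() matches column to column; how='left' mirrors join()'s default.
merged = df.merge(a, on="LabelName", how="left")

print(joined)
```

Both fixes should give the same rows here: every row of df, with the matching ReadableName appended as a new column.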

John Zwinck answered Oct 05 '22 16:10



The problem is that DataFrame 1 contains an integer column alongside its string columns, whereas every column in DataFrame 2 is a string.

The simplest solution is to cast all columns to strings (df1 and df2 below stand for the question's df and a):

pd.merge(df1.astype(str), df2.astype(str), how='outer')

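Alternatively, as the ValueError message itself suggests, use concat (note that this stacks the two DataFrames row-wise rather than matching rows on LabelName):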

pd.concat([df1, df2])
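
The two calls above produce quite different results, which a small sketch may make clearer. The data below is hypothetical; df1 and df2 are assumed to correspond to the question's df and a.

```python
import pandas as pd

# Hypothetical data; df1 and df2 stand in for the question's df and a.
df1 = pd.DataFrame({
    "ImageID": ["img1", "img2"],
    "Source": ["verification", "crowdsource"],
    "LabelName": ["/m/01", "/m/02"],
    "Confidence": [1, 0],
})
df2 = pd.DataFrame({
    "LabelName": ["/m/01", "/m/02"],
    "ReadableName": ["Cat", "Dog"],
})

# With no `on` argument, merge() uses the columns both frames share
# (here only LabelName). astype(str) also turns Confidence into strings.
merged = pd.merge(df1.astype(str), df2.astype(str), how="outer")

# concat() appends df2's rows below df1's and fills the columns a frame
# lacks with NaN; it does not pair rows by LabelName.
stacked = pd.concat([df1, df2])

print(merged.shape)
print(stacked.shape)
```

If Confidence is needed as a number afterwards, it would have to be converted back, for example with merged["Confidence"].astype(int).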
Karn Kumar answered Oct 05 '22 14:10
