
Pandas merge two datasets with same number of rows

Tags: python, pandas

I have two tables with the same number of rows (the second table is computed from the first by processing the text inside T1). Both are stored as pandas DataFrames. T2 has no column in common with T1. Here is a small example, because my real tables are huge:

T1:
| name  | street  | city   |
|-------|---------|--------|
| David | street1 | Prague |
| John  | street2 | Berlin |
| Joe   | street3 | London |

T2:
| computed1 | computed2 |
|-----------|-----------|
| 0.5       | 0.3       |
| 0.2       | 0.8       |
| 0.1       | 0.6       |

Merged:
| name  | street  | city   | computed1 | computed2 |
|-------|---------|--------|-----------|-----------|
| David | street1 | Prague | 0.5       | 0.3       |
| John  | street2 | Berlin | 0.2       | 0.8       |
| Joe   | street3 | London | 0.1       | 0.6       |

I tried these commands:

pd.concat([T1,T2])
pd.merge([T1,T2])
result = T1.join(T2)

With concat and merge, only the first thousand rows are combined and the rest is filled with NaN (I double-checked that both tables are the same size), and .join does not combine them because they have nothing in common.

Is there a way to combine these two tables in pandas?

Thanks

stanedav asked Dec 05 '17 13:12





1 Answer

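concat aligns rows by index label, not by position, so you need reset_index(drop=True) on both DataFrames to give them the same default index, and axis=1 to place the columns side by side: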

df = pd.concat([T1.reset_index(drop=True), T2.reset_index(drop=True)], axis=1)
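
To illustrate why the reset helps, here is a minimal sketch using the sample tables from the question. The non-default index on T1 (10, 20, 30) is an assumption made for the demonstration, standing in for an index left over from earlier filtering or loading; the names `misaligned` and `merged` are hypothetical.

```python
import pandas as pd

# Sample data mirroring the tables in the question.
T1 = pd.DataFrame(
    {
        "name": ["David", "John", "Joe"],
        "street": ["street1", "street2", "street3"],
        "city": ["Prague", "Berlin", "London"],
    },
    index=[10, 20, 30],  # assumed non-default index, e.g. left over from filtering
)
T2 = pd.DataFrame(
    {"computed1": [0.5, 0.2, 0.1], "computed2": [0.3, 0.8, 0.6]}
)  # default RangeIndex: 0, 1, 2

# concat aligns on index labels, so labels that do not match
# produce extra rows padded with NaN.
misaligned = pd.concat([T1, T2], axis=1)

# Resetting both indices makes the rows line up by position.
merged = pd.concat(
    [T1.reset_index(drop=True), T2.reset_index(drop=True)], axis=1
)
print(merged)
```

If the original index of T1 has to be kept, assigning `T2.index = T1.index` before the concat should give the same alignment without resetting T1.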
jezrael answered Oct 26 '22 00:10
