
Combine 2 dataframes and then separate them

I have 2 dataframes with the same column headers, and I want to one-hot encode both of them. I cannot encode them one by one. Instead, I want to append the two dataframes together, one-hot encode the result, and then split it back into 2 dataframes, each with its own headers.

The code below one-hot encodes each dataframe separately instead of merging them first and then encoding.

train = pd.get_dummies(train, columns= ['is_discount', 'gender', 'city'])
test = pd.get_dummies(test, columns= ['is_discount', 'gender', 'city'])
Mervyn Lee asked Nov 17 '17 13:11





1 Answer

Use concat with keys, encode the combined frame once, then split it back apart, i.e.:

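import pandas as pd

# Example dataframes
train = pd.DataFrame({'x': [1, 2, 3, 4]})
test = pd.DataFrame({'x': [4, 2, 5, 0]})

# Concatenate with keys 0 (train) and 1 (test), then encode once.
# dtype=int keeps the 0/1 output shown below; newer pandas versions default to booleans.
temp = pd.get_dummies(pd.concat([train, test], keys=[0, 1]), columns=['x'], dtype=int)

# Select each frame back out of the MultiIndex
train, test = temp.xs(0), temp.xs(1)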

Output:

#Train
   x_0  x_1  x_2  x_3  x_4  x_5
0    0    1    0    0    0    0
1    0    0    1    0    0    0
2    0    0    0    1    0    0
3    0    0    0    0    1    0

#Test
   x_0  x_1  x_2  x_3  x_4  x_5
0    0    0    0    0    1    0
1    0    0    1    0    0    0
2    0    0    0    0    0    1
3    1    0    0    0    0    0
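
The same idea applied to the columns from the question might look roughly like the sketch below. The sample values and the names `cols`, `combined`, `encoded`, `train_enc` and `test_enc` are made up for illustration; only the column names come from the question. String keys are used instead of 0 and 1 purely for readability.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
train = pd.DataFrame({
    "is_discount": [1, 0, 1],
    "gender": ["M", "F", "F"],
    "city": ["KL", "Penang", "KL"],
})
test = pd.DataFrame({
    "is_discount": [0, 0],
    "gender": ["M", "M"],
    "city": ["Ipoh", "KL"],
})

cols = ["is_discount", "gender", "city"]

# Stack both frames under the keys "train" and "test", then encode once so
# every category seen in either frame gets its own column.
combined = pd.concat([train, test], keys=["train", "test"])
encoded = pd.get_dummies(combined, columns=cols, dtype=int)

# Split back on the outer index level; each part keeps its original row index.
train_enc = encoded.xs("train")
test_enc = encoded.xs("test")

print(list(train_enc.columns))
print(train_enc.shape, test_enc.shape)
```

Because the encoding happens on the combined frame, both results end up with identical columns, even for a category such as "Ipoh" that appears only in the test data.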
Bharath answered Sep 24 '22 07:09
