
Append two data frames with pandas

Tags:

python

pandas

When I try to merge two dataframes by rows doing:

bigdata = data1.append(data2) 

I get the following error:

Exception: Index cannot contain duplicate values! 

The index of the first data frame runs from 0 to 38 and that of the second from 0 to 48. I understand that I have to modify the index of one of the data frames before merging, but I don't know how to.

Thank you.

These are the two dataframes:

data1:

    meta  particle  ratio   area    type
0   2     part10    1.348   0.8365  touching
1   2     part18    1.558   0.8244  single
2   2     part2     1.893   0.894   single
3   2     part37    0.6695  1.005   single
....clip...
36  2     part23    1.051   0.8781  single
37  2     part3     80.54   0.9714  nuclei
38  2     part34    1.071   0.9337  single

data2:

    meta  particle  ratio    area    type
0   3     part10    0.4756   1.025   single
1   3     part18    0.04387  1.232   dusts
2   3     part2     1.132    0.8927  single
...clip...
46  3     part46    13.71    1.001   nuclei
47  3     part3     0.7439   0.9038  single
48  3     part34    0.4349   0.9956  single

The first column is the index.

Jean-Pat asked Oct 15 '11 08:10



2 Answers

The append function has an optional argument ignore_index which you should use here to join the records together, since the index isn't meaningful for your application.
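As a minimal sketch of this suggestion, assuming small made-up frames in place of the real data: in the pandas version from the question the call would be `data1.append(data2, ignore_index=True)`. `DataFrame.append` was later deprecated and removed in pandas 2.0, so the sketch below uses `pd.concat`, which accepts the same `ignore_index` argument.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question (a few rows only).
data1 = pd.DataFrame(
    {
        "meta": [2, 2, 2],
        "particle": ["part10", "part18", "part2"],
        "ratio": [1.348, 1.558, 1.893],
    }
)
data2 = pd.DataFrame(
    {
        "meta": [3, 3],
        "particle": ["part10", "part18"],
        "ratio": [0.4756, 0.04387],
    }
)

# ignore_index=True discards both original indexes and renumbers
# the combined rows from 0 to n-1, so no index label is repeated.
bigdata = pd.concat([data1, data2], ignore_index=True)
```

The rows of `data2` follow those of `data1`, and the `meta` column still tells the two sources apart.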

Wes McKinney answered Sep 27 '22 21:09


You could first identify the rows with duplicated index labels (not duplicated values) using the groupby method, and then apply a sum or mean to all rows that share an index label.

data1 = data1.groupby(data1.index).sum()
data2 = data2.groupby(data2.index).sum()
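To illustrate what this does, here is a small sketch on a hypothetical frame (not the data from the question) in which the index label 0 appears twice. Grouping by the index and taking the mean leaves one row per label.

```python
import pandas as pd

# Hypothetical frame with a repeated index label (0 appears twice).
df = pd.DataFrame(
    {"ratio": [1.0, 3.0, 5.0], "area": [0.8, 1.0, 0.9]},
    index=[0, 0, 1],
)

# Rows sharing an index label are collapsed into a single row
# holding the mean of each numeric column.
deduped = df.groupby(df.index).mean()
```

Note that this combines rows instead of keeping them all, so it probably only fits when rows with the same label describe the same record. Text columns such as `type` have no meaningful mean and would need separate handling.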
Madcat answered Sep 27 '22 21:09