When I try to merge two dataframes by rows doing:
bigdata = data1.append(data2)
I get the following error:
Exception: Index cannot contain duplicate values!
The index of the first data frame runs from 0 to 38 and the second one from 0 to 48. I understand that I have to modify the index of one of the data frames before merging, but I don't know how to.
Thank you.
These are the two dataframes:
data1:
    meta particle   ratio    area      type
0      2   part10   1.348  0.8365  touching
1      2   part18   1.558  0.8244    single
2      2    part2   1.893   0.894    single
3      2   part37  0.6695   1.005    single
....clip...
36     2   part23   1.051  0.8781    single
37     2    part3   80.54  0.9714    nuclei
38     2   part34   1.071  0.9337    single
data2:
    meta particle    ratio    area    type
0      3   part10   0.4756   1.025  single
1      3   part18  0.04387   1.232   dusts
2      3    part2    1.132  0.8927  single
...clip...
46     3   part46    13.71   1.001  nuclei
47     3    part3   0.7439  0.9038  single
48     3   part34   0.4349  0.9956  single
The first column is the index.
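The pandas DataFrame.append() method appends the rows of one DataFrame to another; columns that exist only in the appended frame are added as new columns. It can also append several DataFrames at once when given a list. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, where pandas.concat() is the replacement.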
DataFrames can also be joined along their columns with the concat() method. Its main arguments are: objs, the DataFrames to combine; axis, where 0 refers to the row axis and 1 refers to the column axis; and join, the type of join ("inner" or "outer").
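As a rough illustration of the axis and join arguments described above, here is a minimal sketch. The frames left and right are made up for the example and are not the data from the question.

```python
import pandas as pd

# Hypothetical toy frames with partially overlapping index labels
left = pd.DataFrame({"ratio": [1.348, 1.558]}, index=[0, 1])
right = pd.DataFrame({"area": [0.8244, 0.894]}, index=[1, 2])

# axis=0 stacks the rows; index labels are kept, so label 1 appears twice
rows = pd.concat([left, right], axis=0)

# axis=1 places the frames side by side, aligning rows on the index
cols_outer = pd.concat([left, right], axis=1, join="outer")  # union of labels
cols_inner = pd.concat([left, right], axis=1, join="inner")  # shared labels only

print(rows.shape, cols_outer.shape, cols_inner.shape)
```

With join="outer" the cells that have no counterpart are filled with NaN; with join="inner" only the index labels present in both frames survive.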
We can use either pandas.merge() or DataFrame.merge() to merge DataFrames, two at a time. Merging is similar to a SQL join and supports several join types: inner, left, right, outer, and cross.
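For comparison, a key-based merge might look like the sketch below. It assumes you want to match rows on the particle column, which the question does not actually ask for; the suffix names are also invented for the example.

```python
import pandas as pd

# A few rows taken from the question's data1 and data2 (ratio column only)
data1 = pd.DataFrame({"particle": ["part10", "part18", "part2"],
                      "ratio": [1.348, 1.558, 1.893]})
data2 = pd.DataFrame({"particle": ["part10", "part18", "part46"],
                      "ratio": [0.4756, 0.04387, 13.71]})

# Inner join: keep only particles present in both frames
inner = pd.merge(data1, data2, on="particle", how="inner",
                 suffixes=("_meta2", "_meta3"))

# Outer join: keep every particle, filling missing values with NaN
outer = data1.merge(data2, on="particle", how="outer",
                    suffixes=("_meta2", "_meta3"))

print(inner)
print(outer)
```

Unlike appending, a merge produces one row per matched key rather than stacking the two frames, so it is probably not what you want if the goal is simply to combine the records by rows.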
The append function has an optional argument, ignore_index, which you should use here to join the records together, since the index isn't meaningful for your application.
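A minimal sketch of this, using only the first three rows of each frame from the question. It uses pandas.concat(), which accepts the same ignore_index argument and still works on pandas 2.0 and later, where DataFrame.append() is no longer available.

```python
import pandas as pd

# Stand-ins for data1 and data2: the first three rows shown in the question
data1 = pd.DataFrame({
    "meta": [2, 2, 2],
    "particle": ["part10", "part18", "part2"],
    "ratio": [1.348, 1.558, 1.893],
    "area": [0.8365, 0.8244, 0.894],
    "type": ["touching", "single", "single"],
})
data2 = pd.DataFrame({
    "meta": [3, 3, 3],
    "particle": ["part10", "part18", "part2"],
    "ratio": [0.4756, 0.04387, 1.132],
    "area": [1.025, 1.232, 0.8927],
    "type": ["single", "dusts", "single"],
})

# ignore_index=True drops both original indexes and renumbers the rows from 0
bigdata = pd.concat([data1, data2], ignore_index=True)

print(bigdata.index.tolist())  # [0, 1, 2, 3, 4, 5]
```

On older pandas versions that still have DataFrame.append(), data1.append(data2, ignore_index=True) should give the same result.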
You could first identify the rows with duplicated index labels (not duplicated values) using the groupby method, and then apply a sum or mean operation to all the rows that share an index label.
data1 = data1.groupby(data1.index).sum()
data2 = data2.groupby(data2.index).sum()
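To show what the groupby approach does, here is a small sketch on a made-up frame df whose index label 0 occurs twice. The column values are invented so the result is easy to check by hand.

```python
import pandas as pd

# Hypothetical frame in which index label 0 is duplicated
df = pd.DataFrame({"ratio": [1.0, 2.0, 4.0],
                   "area": [0.5, 0.25, 1.0]},
                  index=[0, 0, 1])

# Group rows by index label and sum each group, leaving one row per label
dedup = df.groupby(df.index).sum()

print(dedup)
```

Keep in mind that this changes the data: rows sharing a label are collapsed into a single aggregated row. It only has an effect when a single frame contains repeated labels, and it is only sensible for numeric columns.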