When I try to merge two dataframes by rows doing:
bigdata = data1.append(data2)
I get the following error:
Exception: Index cannot contain duplicate values!
The index of the first data frame runs from 0 to 38 and the second one from 0 to 48. I understand that I have to modify the index of one of the data frames before merging, but I don't know how to.
Thank you.
These are the two dataframes:
data1:
    meta  particle  ratio   area    type
0   2     part10    1.348   0.8365  touching
1   2     part18    1.558   0.8244  single
2   2     part2     1.893   0.894   single
3   2     part37    0.6695  1.005   single
....clip...
36  2     part23    1.051   0.8781  single
37  2     part3     80.54   0.9714  nuclei
38  2     part34    1.071   0.9337  single

data2:
    meta  particle  ratio    area    type
0   3     part10    0.4756   1.025   single
1   3     part18    0.04387  1.232   dusts
2   3     part2     1.132    0.8927  single
...clip...
46  3     part46    13.71    1.001   nuclei
47  3     part3     0.7439   0.9038  single
48  3     part34    0.4349   0.9956  single

The first column is the index.
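The pandas.DataFrame.append() method appends the rows of another DataFrame to the caller, adding any columns the caller does not already have; passing a list lets it append several DataFrames at once. Note that append() was deprecated in pandas 1.4 and removed in pandas 2.0, where pandas.concat() takes its place.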
It is also possible to join DataFrames with the pandas.concat() method. Its main parameters are: objs, the list of DataFrames to combine; axis, where 0 refers to the row axis and 1 refers to the column axis; and join, the type of join ("inner" or "outer").
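To make the axis and join parameters concrete, here is a minimal sketch. The two small frames and their names (left, right) are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical frames for illustration only.
left = pd.DataFrame({"meta": [2, 2], "ratio": [1.348, 1.558]})
right = pd.DataFrame({"meta": [3, 3], "area": [1.025, 1.232]})

# axis=0 stacks rows; join="outer" keeps the union of the columns,
# filling cells that a frame does not have with NaN.
rows_outer = pd.concat([left, right], axis=0, join="outer")

# join="inner" keeps only the columns present in every frame.
rows_inner = pd.concat([left, right], axis=0, join="inner")

# axis=1 places the frames side by side, aligning rows on the index.
side_by_side = pd.concat([left, right], axis=1)

print(rows_outer)
print(rows_inner)
print(side_by_side)
```

With axis=0 and no further arguments, the original index labels are kept, so the stacked result still has the labels 0 and 1 twice each.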
We can use either pandas.merge() or DataFrame.merge() to merge DataFrames. Merging is similar to a SQL join and supports the join types inner, left, right, outer, and cross.
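For contrast with row-wise stacking, the sketch below shows what a key-based merge does. The frames first and second are hypothetical, loosely modeled on a few rows from the question, and the suffixes are arbitrary choices.

```python
import pandas as pd

# Hypothetical frames sharing a "particle" key column.
first = pd.DataFrame({
    "particle": ["part10", "part18", "part2"],
    "ratio": [1.348, 1.558, 1.893],
})
second = pd.DataFrame({
    "particle": ["part10", "part18", "part46"],
    "ratio": [0.4756, 0.04387, 13.71],
})

# Inner join: only particles found in both frames survive.
inner = pd.merge(first, second, on="particle", how="inner", suffixes=("_1", "_2"))

# Outer join: every particle from either frame, with NaN where a side is missing.
outer = first.merge(second, on="particle", how="outer", suffixes=("_1", "_2"))

print(inner)
print(outer)
```

A merge matches rows on a key and widens the table, so it is probably not what the question needs; stacking one frame under the other is a concatenation.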
The append function has an optional argument ignore_index which you should use here to join the records together, since the index isn't meaningful for your application.
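A minimal sketch of this, assuming small stand-in frames rather than the full data from the question. Because DataFrame.append() was removed in pandas 2.0, the sketch uses pd.concat(), which accepts the same ignore_index argument; on older pandas versions, data1.append(data2, ignore_index=True) should give the same result.

```python
import pandas as pd

# Stand-in frames with overlapping default indexes (0..2 and 0..1).
data1 = pd.DataFrame({
    "meta": [2, 2, 2],
    "particle": ["part10", "part18", "part2"],
    "ratio": [1.348, 1.558, 1.893],
})
data2 = pd.DataFrame({
    "meta": [3, 3],
    "particle": ["part10", "part18"],
    "ratio": [0.4756, 0.04387],
})

# ignore_index=True discards both original indexes and builds a fresh 0..n-1 index.
bigdata = pd.concat([data1, data2], ignore_index=True)

print(bigdata)
```

The rows keep their order (all of data1, then all of data2), and the meta column still tells the two sources apart.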
You could first find the rows that share an index label (a duplicated index, not duplicated values) using the groupby method, and then aggregate each group of rows with sum or mean so that every index label appears only once.
data1 = data1.groupby(data1.index).sum()
data2 = data2.groupby(data2.index).sum()
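The following sketch illustrates the idea on a hypothetical frame (df) whose index label 0 appears twice. It uses mean instead of sum, and numeric columns only, as an assumption for the example.

```python
import pandas as pd

# Hypothetical frame: index label 0 is repeated.
df = pd.DataFrame(
    {"ratio": [1.0, 3.0, 5.0], "area": [0.5, 1.5, 2.0]},
    index=[0, 0, 1],
)

# Check whether any index label occurs more than once.
has_duplicates = bool(df.index.duplicated().any())

# Collapse rows that share an index label by averaging them.
collapsed = df.groupby(df.index).mean()

print(has_duplicates)
print(collapsed)
```

This only makes sense when rows with the same label describe the same observation. If the labels are just default row numbers, as the question suggests, resetting the index is likely the better fit.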