Possible Duplicate:
Merge multiple data frames in a list simultaneously
Example data.frames:
df1 = data.frame(id=c('1','73','2','10','43'),v1=c(1,2,3,4,5))
df2 = data.frame(id=c('7','23','57','2','62','96'),v2=c(1,2,3,4,5,6))
df3 = data.frame(id=c('23','62'),v3=c(1,2))
Note: id is unique within each data.frame. I want the resulting matrix to look like
1  1  NA NA
2  3  4  NA
7  NA 1  NA
10 4  NA NA
23 NA 2  1
43 5  NA NA
57 NA 3  NA
62 NA 5  2
73 2  NA NA
96 NA 6  NA
I only show 3 datasets here, but I actually have at least 22 of them, so in the end I want an n x (22+1) matrix, where n is the number of distinct ids across all 22 datasets.
Given 2 datasets, for example, the ids should go in the first column and the values in the 2nd and 3rd columns; where no value exists for an id, put NA instead.
Put them into a list and use merge with Reduce:
Reduce(function(x, y) merge(x, y, all=TRUE), list(df1, df2, df3))
#    id v1 v2 v3
# 1   1  1 NA NA
# 2  10  4 NA NA
# 3   2  3  4 NA
# 4  43  5 NA NA
# 5  73  2 NA NA
# 6  23 NA  2  1
# 7  57 NA  3 NA
# 8  62 NA  5  2
# 9   7 NA  1 NA
# 10 96 NA  6 NA
You can also use this more concise version:
Reduce(function(...) merge(..., all=TRUE), list(df1, df2, df3))
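The merged result above is a data.frame whose rows are not in numeric id order, while the question asks for a matrix sorted by id. One possible way to get there is sketched below. It assumes every id can be parsed as a number, and the names merged and result are only illustrative.

```r
# Example data from the question
df1 <- data.frame(id = c('1','73','2','10','43'), v1 = c(1,2,3,4,5))
df2 <- data.frame(id = c('7','23','57','2','62','96'), v2 = c(1,2,3,4,5,6))
df3 <- data.frame(id = c('23','62'), v3 = c(1,2))

# Full outer join of all data.frames on the shared "id" column
merged <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE),
                 list(df1, df2, df3))

# ids are strings (or factors in older R versions), so convert them
# to numbers before sorting; assumes every id is numeric
merged$id <- as.numeric(as.character(merged$id))
merged <- merged[order(merged$id), ]

# All columns are numeric now, so this yields a numeric matrix
result <- as.matrix(merged)
rownames(result) <- NULL
```

With 22 data.frames, the same call should work by passing all of them in the list, for example a list built with mget() or lapply() when reading the files.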