I have a list of data frames that looks like this:
ls[[1]]
[[1]]
month year oracle
1 2004 356.0000
2 2004 390.0000
3 2004 394.4286
4 2004 391.8571
ls[[2]]
[[2]]
month year microsoft
1 2004 339.0000
2 2004 357.7143
3 2004 347.1429
4 2004 333.2857
How do I create a single data frame that looks like this:
month year oracle microsoft
1 2004 356.0000 339.0000
2 2004 390.0000 357.7143
3 2004 394.4286 347.1429
4 2004 391.8571 333.2857
We could also use Reduce with merge:
Reduce(function(...) merge(..., by = c('month', 'year')), ls)
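To make the Reduce/merge idea concrete, here is a minimal sketch that rebuilds the two data frames from the question and folds merge over them. The list name `dfs` is an assumption chosen for illustration (the question calls its list `ls`), and the values are copied from the question's printed output.

```r
# Rebuild the two data frames from the question (values copied from the post).
oracle <- data.frame(month = 1:4, year = 2004L,
                     oracle = c(356, 390, 394.4286, 391.8571))
microsoft <- data.frame(month = 1:4, year = 2004L,
                        microsoft = c(339, 357.7143, 347.1429, 333.2857))

# `dfs` is a hypothetical name; the question's list is called `ls`.
dfs <- list(oracle, microsoft)

# Reduce applies merge() pairwise from left to right, joining on month and year.
combined <- Reduce(function(x, y) merge(x, y, by = c("month", "year")), dfs)
print(combined)
```

Because both data frames share the same month and year values, the default inner merge keeps all four rows and gives one value column per company.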
Using @Jaap's example, if the month and year values are not the same in each data frame, use the all=TRUE option from merge.
Reduce(function(...) merge(..., by = c('month', 'year'), all=TRUE), ls)
# month year oracle microsoft google
#1 1 2004 356.0000 NA NA
#2 2 2004 390.0000 339.0000 NA
#3 3 2004 394.4286 357.7143 390.0000
#4 4 2004 391.8571 347.1429 391.8571
#5 5 2004 NA 333.2857 357.7143
#6 6 2004 NA NA 333.2857
Using the Reduce/merge code from @akrun's answer will work great if the values for the month and year columns are the same for each dataframe. However, when they are not the same (example data at the end of this answer),
Reduce(function(...) merge(..., by = c('month', 'year')), ls)
will return only the rows which are common in each dataframe:
month year oracle microsoft google
1 3 2004 394.4286 357.7143 390.0000
2 4 2004 391.8571 347.1429 391.8571
In that case, you can either use all=TRUE (as shown by @akrun) or use full_join from the dplyr package as an alternative when you want to include all rows/observations:
library(dplyr)
Reduce(function(...) full_join(..., by = c('month', 'year')), ls)
# or just (without by, full_join joins on all columns the data frames share):
Reduce(full_join, ls)
This will result in:
month year oracle microsoft google
1 1 2004 356.0000 NA NA
2 2 2004 390.0000 339.0000 NA
3 3 2004 394.4286 357.7143 390.0000
4 4 2004 391.8571 347.1429 391.8571
5 5 2004 NA 333.2857 357.7143
6 6 2004 NA NA 333.2857
Used data:
ls <- list(structure(list(month = 1:4, year = c(2004L, 2004L, 2004L, 2004L), oracle = c(356, 390, 394.4286, 391.8571)), .Names = c("month", "year", "oracle"), class = "data.frame", row.names = c(NA, -4L)),
structure(list(month = 2:5, year = c(2004L, 2004L, 2004L, 2004L), microsoft = c(339, 357.7143, 347.1429, 333.2857)), .Names = c("month", "year", "microsoft"), class = "data.frame", row.names = c(NA,-4L)),
structure(list(month = 3:6, year = c(2004L, 2004L, 2004L, 2004L), google = c(390, 391.8571, 357.7143, 333.2857)), .Names = c("month", "year", "google"), class = "data.frame", row.names = c(NA,-4L)))
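As a rough base-R sketch of the difference discussed above, the following rebuilds the same three data frames with data.frame() and compares the default (inner) merge with all=TRUE. The helper `merge_all` and the names `dfs`, `inner` and `outer` are hypothetical, introduced only for this illustration.

```r
# Same values as the example data, written with data.frame() for readability.
oracle <- data.frame(month = 1:4, year = 2004L,
                     oracle = c(356, 390, 394.4286, 391.8571))
microsoft <- data.frame(month = 2:5, year = 2004L,
                        microsoft = c(339, 357.7143, 347.1429, 333.2857))
google <- data.frame(month = 3:6, year = 2004L,
                     google = c(390, 391.8571, 357.7143, 333.2857))
dfs <- list(oracle, microsoft, google)

# Hypothetical helper: fold merge() over a list of data frames.
# keep_all = FALSE keeps only keys present in every frame (inner join);
# keep_all = TRUE keeps every key and fills the gaps with NA (full join).
merge_all <- function(frames, keep_all = FALSE) {
  Reduce(function(x, y) merge(x, y, by = c("month", "year"), all = keep_all),
         frames)
}

inner <- merge_all(dfs)                   # months 3 and 4 only
outer <- merge_all(dfs, keep_all = TRUE)  # months 1 to 6, with NAs

print(inner)
print(outer)
```

With this data, the inner merge keeps only the two months that appear in all three data frames, while all=TRUE keeps all six months and marks the missing values as NA.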