I'm trying to use the Reduce function in R to apply the merge function across multiple data frames. The problem is that I would like to call merge with the argument all=T, and there seems to be nowhere to specify this in the higher-order Reduce function.
So I'd like:
a <- data.frame(id=c(1, 2, 3, 4), a=c('a', 'b', 'c', 'd'))
b <- data.frame(id=c(1, 2, 5, 6), b=c('a', 'b', 'e', 'f'))
c <- data.frame(id=c(3, 4, 5, 6), c=c('c', 'd', 'e', 'f'))
out <- Reduce(merge, list(a, b, c), all=T)
out
id a b c
1 1 a a <NA>
2 2 b b <NA>
3 3 c <NA> c
4 4 d <NA> d
5 5 <NA> e e
6 6 <NA> f f
But because merge defaults to all=F, what I'm getting is:
[1] id a b c
<0 rows> (or 0-length row.names)
As far as I know, Reduce cannot forward extra arguments to the function it is given. But you can wrap merge, with the arguments you need, in an anonymous function and pass that to Reduce:
Reduce(function(x, y) merge(x, y, by = "id", all = T), list(a, b, c))
# id a b c
#1 1 a a <NA>
#2 2 b b <NA>
#3 3 c <NA> c
#4 4 d <NA> d
#5 5 <NA> e e
#6 6 <NA> f f
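If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: merge_all is a hypothetical name, not a base R function, and it assumes every data frame in the list shares the key column. It also spells out TRUE instead of T, since T is an ordinary variable that can be reassigned.

```r
# Hypothetical helper (not part of base R): merge a list of data frames,
# forwarding any extra arguments (e.g. by, all) to every merge() call.
merge_all <- function(dfs, ...) {
  Reduce(function(x, y) merge(x, y, ...), dfs)
}

a <- data.frame(id = c(1, 2, 3, 4), a = c("a", "b", "c", "d"))
b <- data.frame(id = c(1, 2, 5, 6), b = c("a", "b", "e", "f"))
c <- data.frame(id = c(3, 4, 5, 6), c = c("c", "d", "e", "f"))

# Full outer join on id across all three data frames.
out <- merge_all(list(a, b, c), by = "id", all = TRUE)
out
```

Passing by = "id" explicitly keeps merge from joining on every column name the inputs happen to share, which matters once intermediate results pick up more columns.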