I am using foreach with .combine = rbindlist. This does not work, although .combine = rbind works fine.
A simple example illustrates the problem:
> t2 <- data.table(col1=c(1,2,3))
> foreach (i=1:3, .combine=rbind) %dopar% unique(t2)
   col1
1:    1
2:    2
3:    3
4:    1
5:    2
6:    3
7:    1
8:    2
9:    3
# But using rbindlist gives an error
> foreach (i=1:3, .combine=rbindlist) %dopar% unique(t2)
error calling combine function:
<simpleError in fun(result.1, result.2): unused argument(s) (result.2)>
NULL
Has anyone been able to make this work?
Thanks in advance.
It's basically what you said. rbindlist expects a single list argument, and the error you're getting is the same as this one:
result.1 = data.table(blah = 23)
result.2 = data.table(blah = 34)
rbindlist(result.1, result.2)
#Error in rbindlist(result.1, result.2) : unused argument (result.2)
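For comparison, here is a minimal sketch of the call shape that rbindlist does accept, assuming only that data.table is installed. The two tables are wrapped in one list, which is what a .combine function has to do before handing its arguments over:

```r
library(data.table)

result.1 <- data.table(blah = 23)
result.2 <- data.table(blah = 34)

# rbindlist takes ONE list of tables, not the tables as separate arguments
combined <- rbindlist(list(result.1, result.2))
print(combined)
```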
If you want to use rbindlist, the way to do it would be this:
rbindlist(foreach (i = 1:3) %dopar% unique(t2))
or this:
foreach (i=1:3, .combine=function(x,y)rbindlist(list(x,y))) %dopar% unique(t2)
Here's a way to both use rbindlist as your .combine function and have .multicombine=TRUE:
foreach (i=1:3,
         .combine=function(...) rbindlist(list(...)),
         .multicombine=TRUE) %dopar% unique(t2)
If you have a decent number of separate results to aggregate, this could be quite a bit faster than combining only two at a time.
For a single foreach statement, this produces the same result as letting foreach default .combine to list and wrapping with rbindlist, as in eddi's first solution. I'm not sure which is faster, though I would expect them to be close.
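A quick way to check that the two forms agree is to run both and compare. This is only a sketch: it assumes data.table and foreach are installed, and it uses %do% instead of %dopar% so that it runs without a registered parallel backend. The names wrapped, combined and bind_all are made up for illustration.

```r
library(data.table)
library(foreach)

t2 <- data.table(col1 = c(1, 2, 3))

# Form 1: let foreach return a plain list, then bind once at the end
wrapped <- rbindlist(foreach(i = 1:3) %do% unique(t2))

# Form 2: bind inside foreach, handing many results to one combine call
bind_all <- function(...) rbindlist(list(...))
combined <- foreach(i = 1:3, .combine = bind_all, .multicombine = TRUE) %do% unique(t2)

# Prints TRUE if both forms yield the same table
print(isTRUE(all.equal(wrapped, combined)))
```

Swapping %do% for %dopar% after registering a backend should not change the comparison, since the combining happens on the master process either way.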
For small, single-foreach jobs I like wrapping with rbindlist, but when chaining several foreach calls together with %:% I think the above approach (likely in the first foreach) looks cleaner.
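As a rough illustration of the chained case, the sketch below nests two foreach calls with %:%. It is an assumption on my part to give both levels the same rbindlist-based combine function, so that each level returns a data.table; the loop variables a and b and the name bind_all are hypothetical, and %do% is used so no parallel backend is needed.

```r
library(data.table)
library(foreach)

# Shared combine function: bind every result passed in with one call
bind_all <- function(...) rbindlist(list(...))

nested <- foreach(a = 1:2, .combine = bind_all, .multicombine = TRUE) %:%
  foreach(b = 1:3, .combine = bind_all, .multicombine = TRUE) %do%
  data.table(a = a, b = b)

# One row per (a, b) pair
print(nested)
```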