I've got a few test pieces of code that I've been running on various machines, always with the same results. I thought the philosophy behind the various do... packages was that they could be used interchangeably as a backend for foreach's %dopar%. Why is this not the case?
For example, this code snippet works:
library(plyr)
library(doMC)
registerDoMC()
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
While each of these code snippets fail:
library(plyr)
library(doSMP)
workers <- startWorkers(2)
registerDoSMP(workers)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
stopWorkers(workers)
library(plyr)
library(snow)
library(doSNOW)
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
stopCluster(cl)
library(plyr)
library(doMPI)
cl <- startMPIcluster(count = 2)
registerDoMPI(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
closeCluster(cl)
In all four cases, foreach(i = 1:3,.combine = "c") %dopar% {sqrt(i)}
yields the exact same result, so I know I have the packages installed and working properly on each machine I've tested them on.
What is doMC doing differently from doSMP, doSNOW, and doMPI?
doMC
forks the current R process so it inherits all the existing variables. All the other do backends only pass on explicitly requested variables. Unfortunately I didn't realise that, and only tested with doMC
- this is something I hope to fix in the next version of plyr.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With