I used fread with foreach and the doParallel package in R 3.2.0 on Ubuntu 14.04. The following code works just fine, even though I didn't use registerDoParallel.
library(foreach)
library(doParallel)
library(data.table)
write.csv(iris,'test.csv',row.names=F)
cl<-makeCluster(4)
tmp<-foreach(i=1:10) %dopar% { t <- fread('test.csv') }
tmp<-rbindlist(tmp)
stopCluster(cl)
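A likely reason the unregistered version runs at all: when no parallel backend is registered, %dopar% falls back to sequential execution in the calling session (with a warning), where data.table is already attached. The sketch below only illustrates that fallback; the variable names registered_before and res are made up for the example.

```r
library(foreach)

# In a fresh session, no parallel backend has been registered yet.
registered_before <- getDoParRegistered()

# %dopar% still runs, but sequentially in the calling session;
# foreach emits a warning about the missing backend, silenced here.
res <- suppressWarnings(foreach(i = 1:3) %dopar% { i })

print(registered_before)
print(unlist(res))
```

If this is what happened, the cluster created by makeCluster(4) was never actually used in that first run.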
However, when switching to Windows 7 it no longer works, with or without 'registerDoParallel'.
library(foreach)
library(doParallel)
#library(doSNOW)
library(data.table)
write.csv(iris,'test.csv',row.names=F)
cl<-makeCluster(4)
registerDoParallel(cl)
#registerDoSNOW(cl)
tmp<-foreach(i=1:10) %dopar% { t <- fread('test.csv') }
tmp<-rbindlist(tmp)
stopCluster(cl)
The doSNOW package doesn't work either. Below is the error message.
Error in { : task 1 failed - "could not find function "fread""
Does anyone have any similar experience?
A follow-up question concerns nested foreach. It seems the following won't work.
cl<-makeCluster(4)
registerDoParallel(cl)
clusterEvalQ(cl , library(data.table))
tmp<-foreach(j=1:10) %dopar% {
tmp1<-foreach(i=1:10) %dopar% {
t<-fread('test.csv',data.table=T)
}
rbindlist(tmp1)
}
stopCluster(cl)
Thanks to user20650 for the reference. Basically, it can be solved by setting .export='fread' in the foreach function.
More precisely, the following will fix the problem.
tmp<-foreach(i=1:10,.export = 'fread') %dopar% {
t <- fread('test.csv',data.table=T)
}
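As an alternative sketch (not what the reference suggested), foreach also has a .packages argument that loads whole packages on each worker before the loop body runs. Assuming data.table is installed on the machine running the workers, something like the following should avoid the "could not find function" error without exporting individual functions. The worker count of 2 and the name res are arbitrary choices for the example.

```r
library(foreach)
library(doParallel)
library(data.table)

# Same test file as in the question.
write.csv(iris, "test.csv", row.names = FALSE)

cl <- makeCluster(2)
registerDoParallel(cl)

# .packages loads data.table on every worker, so fread is found there.
tmp <- foreach(i = 1:10, .packages = "data.table") %dopar% {
  fread("test.csv")
}
res <- rbindlist(tmp)

stopCluster(cl)
print(dim(res))
```

Loading the package rather than exporting one function also makes the rest of data.table available inside the loop body.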
As for my follow-up question regarding nested foreach, user20650 answered it in his comments: add clusterEvalQ(cl , c(library(data.table),library(foreach))). The following code seems to work on both Ubuntu and Windows.
cl<-makeCluster(4)
registerDoParallel(cl)
clusterEvalQ(cl , c(library(data.table),library(foreach)))
tmp<-foreach(j=1:10) %dopar% {
tmp1<-foreach(i=1:10) %dopar% { t <- fread('test.csv',data.table=T) }
rbindlist(tmp1)
}
stopCluster(cl)
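A variation on the nested case, offered as a sketch rather than the approach from the comments: pass .packages to the outer loop instead of calling clusterEvalQ, and use %do% for the inner loop. The workers have no backend of their own registered, so an inner %dopar% would run sequentially there anyway. The worker count of 2 is arbitrary.

```r
library(foreach)
library(doParallel)
library(data.table)

write.csv(iris, "test.csv", row.names = FALSE)

cl <- makeCluster(2)
registerDoParallel(cl)

# The outer loop is distributed across workers; both packages are loaded there.
tmp <- foreach(j = 1:10, .packages = c("foreach", "data.table")) %dopar% {
  # The inner loop runs sequentially inside each worker.
  tmp1 <- foreach(i = 1:10) %do% {
    fread("test.csv")
  }
  rbindlist(tmp1)
}

stopCluster(cl)
print(length(tmp))
```

Each element of tmp is then one data.table holding the ten stacked copies read by the inner loop.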