I have a for loop that is something like this:
for (i=1:150000) { tempMatrix = {} tempMatrix = functionThatDoesSomething() #calling a function finalMatrix = cbind(finalMatrix, tempMatrix) }
Could you tell me how to make this parallel ?
I tried this based on an example online, but am not sure if the syntax is correct. It also didn't increase the speed much.
finalMatrix = foreach(i=1:150000, .combine=cbind) %dopar% { tempMatrix = {} tempMatrix = functionThatDoesSomething() #calling a function cbind(finalMatrix, tempMatrix) }
Can any for loop be made parallel? No, not any loop can be made parallel. Iterations of the loop must be independent from each other. That is, one cpu core should be able to run one iteration without any side effects to another cpu core running a different iteration.
The parallel package from R 2.14. 0 and later provides functions for parallel execution of R code on machines with multiple cores or processors or multiple computers. It is essentially a blend of the snow and multicore packages. By default, the doParallel package uses snow-like function- ality.
lapply-based parallelism may be the most intuitively familiar way to parallelize tasks in R because it extend R's prolific lapply function.
Thanks for your feedback. I did look up parallel
after I posted this question.
Finally after a few tries, I got it running. I have added the code below in case it is useful to others
library(foreach) library(doParallel) #setup parallel backend to use many processors cores=detectCores() cl <- makeCluster(cores[1]-1) #not to overload your computer registerDoParallel(cl) finalMatrix <- foreach(i=1:150000, .combine=cbind) %dopar% { tempMatrix = functionThatDoesSomething() #calling a function #do other things if you want tempMatrix #Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix) } #stop cluster stopCluster(cl)
Note - I must add a note that if the user allocates too many processes, then user may get this error: Error in serialize(data, node$con) : error writing to connection
Note - If .combine
in the foreach
statement is rbind
, then the final object returned would have been created by appending output of each loop row-wise.
Hope this is useful for folks trying out parallel processing in R for the first time like me.
References: http://www.r-bloggers.com/parallel-r-loops-for-windows-and-linux/ https://beckmw.wordpress.com/2014/01/21/a-brief-foray-into-parallel-processing-with-r/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With