For a long time I have been using sfLapply for a lot of my parallel R scripts. Recently, as I have delved deeper into parallel computing, I have switched to sfClusterApplyLB, which can save a lot of time when individual tasks do not all take the same amount of time to run. Whereas sfLapply waits for every instance in a batch to finish before loading a new batch (which may leave instances idle), with sfClusterApplyLB an instance that completes its task is immediately assigned the next remaining element of the list, potentially saving quite a bit of time when tasks vary in duration. This leads me to ask: why would we ever NOT want to load-balance our runs when using snowfall? The only downside I have found so far is that, when there is an error in the parallelized script, sfClusterApplyLB still cycles through the entire list before raising the error, while sfLapply stops after trying the first batch. What else am I missing? Are there any other costs or downsides to load balancing? Below is example code that shows the difference between the two:
rm(list = ls())  # remove all past worksheet variables
working_dir = "D:/temp/"
setwd(working_dir)
n_spp = 16
spp_nmS = paste0("sp_", 1:n_spp)
spp_nm = spp_nmS[1]
sp_parallel_run = function(sp_nm){
  # send each worker's output to its own log file
  sink(file(paste0(working_dir, sp_nm, "_log.txt"), open = "wt"))
  cat('\n', 'Started on ', date(), '\n')
  ptm0 <- proc.time()
  # busywork loop that takes an arbitrary (random) amount of time to run
  jnk = round(runif(1) * 8000000)
  jnk1 = runif(jnk)
  for (i in 1:length(jnk1)){
    jnk1[i] = jnk1[i] * runif(1)  # was jnk[i], which is NA for i > 1
  }
  ptm1 = proc.time() - ptm0
  jnk = as.numeric(ptm1[3])
  cat('\n', 'It took ', jnk, "seconds to model", sp_nm)
  # close any open sinks so the log file is flushed
  sink.reset <- function(){
    for (i in seq_len(sink.number())){
      sink(NULL)
    }
  }
  sink.reset()
}
require(snowfall)
cpucores = as.integer(Sys.getenv('NUMBER_OF_PROCESSORS'))

# batch dispatch (sfLapply)
sfInit(parallel = TRUE, cpus = cpucores)
sfExportAll()
system.time(sfLapply(spp_nmS, fun = sp_parallel_run))
sfRemoveAll()
sfStop()

# load-balanced dispatch (sfClusterApplyLB)
sfInit(parallel = TRUE, cpus = cpucores)
sfExportAll()
system.time(sfClusterApplyLB(spp_nmS, fun = sp_parallel_run))
sfRemoveAll()
sfStop()
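As an aside on the error behaviour mentioned above: either dispatch mode can be made robust by wrapping the worker function so errors are returned as values instead of aborting the run. A minimal sketch (the `with_error_capture` wrapper name is my own invention, not part of snowfall; the demo uses plain `lapply`, but the same wrapped function could be passed to sfLapply or sfClusterApplyLB):

```r
# Wrap any worker function so errors come back as tagged values
# rather than aborting the whole apply call.
with_error_capture <- function(fun) {
  function(x) {
    tryCatch(fun(x),
             error = function(e) {
               # return a classed object recording the failing input
               structure(list(input = x, message = conditionMessage(e)),
                         class = "worker_error")
             })
  }
}

# demonstration with a deliberately flaky worker
flaky <- function(x) if (x == 3) stop("bad input") else x^2
res <- lapply(1:5, with_error_capture(flaky))

# after the run: which elements failed?
failed <- Filter(function(r) inherits(r, "worker_error"), res)
```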
The sfLapply function is useful because it splits the input values into one group of tasks for each available worker, which is what the mclapply function calls prescheduling. This can give much better performance than sfClusterApplyLB when the individual tasks don't take long.
Here's an extreme example that demonstrates the advantages of prescheduling:
> system.time(sfLapply(1:100000, sqrt))
   user  system elapsed
  0.148   0.004   0.170
> system.time(sfClusterApplyLB(1:100000, sqrt))
   user  system elapsed
 19.317   1.852  21.222