"Cannot open the connection" - HPC in R with snow

Tags: r, hpc, snow

I'm attempting to run a parallel job in R using snow. I've been able to run extremely similar jobs with no trouble on older versions of R and snow. R package dependencies prevent me from reverting.

What happens: My jobs terminate at the parRapply step, i.e., the first time the nodes have to do anything beyond reporting Sys.info(). The error message reads:

Error in checkForRemoteErrors(val) : 
3 nodes produced errors; first error: cannot open the connection 
Calls: parRapply ... clusterApply -> staticClusterApply -> checkForRemoteErrors

Specs: R 2.14.0, snow 0.3-8, RedHat Enterprise Linux Client release 5.6. The snow package has been built on the correct version of R.

Details: The following code appears to execute fine:

cl <- makeCluster(3)
clusterEvalQ(cl,library(deSolve,lib="~/R/library"))
clusterCall(cl,function() Sys.info()[c("nodename","machine")])

I'm an end-user, not a system admin, but I'm desperate for suggestions and insights into what could be going wrong.

Sarah asked Nov 21 '11 21:11


1 Answer

This cryptic error appeared because an input file that the program requests during execution wasn't actually present. Each node attempted to load this file and failed, but the only thing reported back was a "cannot open the connection" message, with no mention of which file was missing.

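What this means is that a "cannot open the connection" error is not necessarily a network or cluster problem: R opens files through connections too, so any file a node fails to open produces the same message. Incredibly annoying!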
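For anyone hitting the same thing, here is a rough sketch of how the missing file could be caught earlier. It is not the code from my actual job: `node_file_status`, `read_input_checked`, and the file name `input.dat` are hypothetical names made up for illustration. The idea is to ask each node whether it can see the file before the real work starts, and to fail with a message that names the file and the node. Both functions are defined at top level so they can be passed to `clusterCall` as they are.

```r
# Hypothetical helper: report what a node sees for a given path.
# Meant to be run on every worker before the real job.
node_file_status <- function(path) {
  list(node   = Sys.info()[["nodename"]],
       wd     = getwd(),            # relative paths resolve against this
       exists = file.exists(path))
}

# Hypothetical wrapper: stop with a descriptive message instead of
# letting read.table() fail with "cannot open the connection".
read_input_checked <- function(path) {
  if (!file.exists(path)) {
    stop(sprintf("input file not found on node %s: %s (working dir: %s)",
                 Sys.info()[["nodename"]], path, getwd()))
  }
  read.table(path)
}

# Possible usage with snow (commented out; "input.dat" is a placeholder):
# library(snow)
# cl <- makeCluster(3)
# str(clusterCall(cl, node_file_status, "input.dat"))
# stopCluster(cl)
```

If any node reports `exists = FALSE`, the path or the working directory on that node is the first thing to check, since a relative path or `~` is resolved on the worker, not on the master.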

Sarah answered Sep 21 '22 21:09