
Errors in makeCluster(multicore): cannot open the connection

I have the following question.

Why does everything run fine when I submit the job to a standard node (maximum 56 cores), but I get an error when I submit the same job/code to the large_memory node (maximum 128 cores)?

Parallelization code in R:

> no_cores <- detectCores() - 1

> cl <- makeCluster(no_cores, outfile=paste0('./info_parallel.log'))

Error

Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b",  :
  cannot open the connection

Calls: <Anonymous> ... doTryCatch -> recvData -> makeSOCKmaster -> 
  socketConnection

In addition: Warning message:

In socketConnection(master, port = port, blocking = TRUE, open = "a+b",  :
  localhost:11232 cannot be opened
Execution halted

Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Execution halted

Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode ->  unserialize
Execution halted

As I said, the R code runs fine on the standard nodes, so I assume it is a problem with the large_memory node. What can that be?

Helen Liu asked May 10 '17 18:05


1 Answer

Finally, I solved it.

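The error was caused by R's default limit on connections. At most 128 connections can be allocated in an R session, and three of these are pre-allocated (stdin, stdout, stderr). Each worker started by makeCluster uses one socket connection on the master, so here the number of "connections" corresponds to the number of cores per node requested in the code.

In the code, the error occurred at the makeCluster call: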

no_cores <- detectCores() - 1

cl <- makeCluster(no_cores, outfile=paste0('./info_parallel.log'))

Here, detectCores() returns the number of cores on the node.

On the standard nodes of the cluster, the number of cores per node is well below 128, which is why the R code runs fine there. In the large_memory partition, the number of cores per node is 128 in my case, so detectCores() - 1 asks for 127 workers, which is more connections than R has available by default. So the error shows as:

cannot open the connection
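
The arithmetic behind this can be sketched as follows. This is only an illustration: the figures 128 and 3 come from R's documentation for connections, the core counts are the ones from the question, and the variable names are made up for this example. Any other open connections (files, pipes, and so on) would reduce the budget further.

```r
# Connection budget in a default R session (see ?connections).
max_connections <- 128L                 # total connections R can allocate
reserved <- 3L                          # pre-allocated: stdin, stdout, stderr
available <- max_connections - reserved # connections left for workers

# Workers requested by detectCores() - 1 on each node type in the question.
workers_standard <- 56L - 1L            # standard node
workers_large <- 128L - 1L              # large_memory node

fits_standard <- workers_standard <= available  # enough connections
fits_large <- workers_large <= available        # not enough connections
```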

I then set the number of workers to 120 when running jobs on the large_memory node (maximum cores = 128). No errors; the code works well.

cl <- makeCluster( 120, outfile=paste0('./info_parallel.log'))
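
To avoid hard-coding 120 per node type, the worker count could be capped instead. The helper below is a minimal sketch, not part of the parallel package: capped_workers and its max_workers argument are hypothetical names, and the default of 120 is simply the value that worked above, not an official constant.

```r
library(parallel)

# Hypothetical helper: leave one core free, but never ask for more
# workers than max_workers, so the cluster stays under R's connection limit.
capped_workers <- function(detected, max_workers = 120L) {
  if (is.na(detected)) return(1L)   # detectCores() can return NA
  as.integer(max(1L, min(detected - 1L, max_workers)))
}

# Usage (not run here):
# cl <- makeCluster(capped_workers(detectCores()), outfile = "./info_parallel.log")
# ... parallel work ...
# stopCluster(cl)
```

With this, the same script should request 55 workers on a 56-core node and 120 on a 128-core node.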

Thanks!

Helen Liu answered Oct 02 '22 21:10
