
"Error in unserialize" - foreach/doSNOW/snow with SOCK (windows)

I'm running a parallel operation using a SOCK cluster with workers on the local machine. If I limit the set I'm iterating over (in one test, 70 tasks instead of the full 135), everything works just fine. If I run the full set, I get the error "Error in unserialize(socklist[[n]]) : error reading from connection".

  • I've unblocked the port in Windows Firewall (both in/out) and allow all access for Rscript/R.

  • It can't be a timeout issue because the socket timeout is set to 365 days.

  • It's not an issue with any particular task, because I can run everything sequentially just fine (it also runs fine in parallel if I split the dataset in half and do two separate parallel runs).

  • The best I can come up with is that there is too much data being transferred over the sockets. There doesn't seem to be a cluster option to throttle data limits.

I'm at a loss on how to proceed. Has anyone seen this issue before or can suggest a fix?

Here's the code I'm using to set up the cluster:

library(doSNOW)  # attaches foreach and snow as dependencies
cluster = makeCluster( degreeOfParallelism , type = "SOCK" , outfile = "" )
registerDoSNOW( cluster )

Edit
While this issue is consistent with the entire dataset, it also appears from time to time with a reduced dataset. That might suggest that this isn't simply a data limit issue.

Edit 2
I dug a little deeper and it turns out that my function in fact has a random component that makes it so that sometimes a task will raise an error. If I run the tasks serially then at the end of the operation I'm told which task failed. If I run in parallel, then I get the "unserialize" error. I tried wrapping the code that gets executed by each task in a tryCatch call with error = function(e) { stop(e) } but that also generates the "unserialize" error. I'm confused because I thought that snow handles errors by passing them back to the master?
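To illustrate the alternative to re-raising with stop(e): the handler can return the failure as an ordinary value, so every task hands something back and the failures can be inspected afterwards. This is only a minimal sketch in plain R with no cluster involved; risky_task and safe_task are hypothetical names, and the deterministic failure stands in for the random one described above.

```r
# Hypothetical task: fails for some inputs (deterministic here, for illustration only).
risky_task <- function(i) {
  if (i %% 3 == 0) stop("task ", i, " failed")
  i^2
}

# Wrapper: turn an error into a normal return value instead of re-raising it.
safe_task <- function(i) {
  tryCatch(
    list(id = i, ok = TRUE, value = risky_task(i), message = NA_character_),
    error = function(e) {
      list(id = i, ok = FALSE, value = NA_real_, message = conditionMessage(e))
    }
  )
}

# Run all tasks, then pick out the ones that failed.
results <- lapply(1:6, safe_task)
failed <- Filter(function(r) !r$ok, results)
failed_ids <- vapply(failed, function(r) r$id, integer(1))
```

The same wrapper could presumably serve as the body of a foreach loop, but that is an assumption and has not been checked against the snow setup in this question.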

SFun28 asked Aug 29 '11 23:08



1 Answer

I have reported this issue to the author of SNOW but unfortunately there has been no reply.

Edit
I haven't seen this issue in a while. I moved to parallel/doParallel. Also, I'm now using try() to wrap any code that gets executed in parallel. I can't reproduce the original issue.
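A rough sketch of what the try() wrapping can look like, using only the base parallel package rather than doParallel to keep it self-contained. The task function risky_task, the worker count of 2, and the inputs 1:4 are all made up for illustration; this is not the original code.

```r
library(parallel)

# Hypothetical task that fails for one input.
risky_task <- function(i) {
  if (i == 2) stop("task ", i, " failed")
  i * 10
}

cl <- makeCluster(2)  # two local socket workers; the count is arbitrary

# Each worker wraps the task in try(), so a failure comes back as a
# "try-error" object instead of an error raised on the worker.
results <- tryCatch(
  parLapply(cl, 1:4, function(i, f) try(f(i), silent = TRUE), f = risky_task),
  finally = stopCluster(cl)  # always shut the workers down
)

# Identify which tasks failed.
failed <- vapply(results, inherits, logical(1), what = "try-error")
which(failed)
```

Successful tasks return their normal values, and the failed ones can be picked out with inherits(x, "try-error") after the run finishes.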

SFun28 answered Nov 02 '22 20:11
