Error in parallel R: Error in serialize(data, node$con) : error writing to connection

I've seen a few other posts on this topic, and none seemed to be quite the same as the problem I'm having. But here goes:

I'm running a function in parallel using

library(doParallel)   # also attaches foreach and parallel

cores <- detectCores()
cl <- makeCluster(8L, outfile = "output.txt")
registerDoParallel(cl)
x <- foreach(i = 1:length(y), .combine = 'list', .packages = c('httr', 'jsonlite'),
             .multicombine = TRUE, .verbose = F, .inorder = F) %dopar% {
  my_function(y[i])   # placeholder for the worker function applied to each element
}

This often works fine, but is now throwing the error:

Error in serialize(data, node$con) : error writing to connection

Upon examination of the output.txt file I see:

starting worker pid=11112 on localhost:11828 at 12:38:32.867
starting worker pid=10468 on localhost:11828 at 12:38:33.389
starting worker pid=4996 on localhost:11828 at 12:38:33.912
starting worker pid=3300 on localhost:11828 at 12:38:34.422
starting worker pid=10808 on localhost:11828 at 12:38:34.937
starting worker pid=5840 on localhost:11828 at 12:38:35.435
starting worker pid=8764 on localhost:11828 at 12:38:35.940
starting worker pid=7384 on localhost:11828 at 12:38:36.448
Error in unserialize(node$con) : embedded nul in string: '\0\0\0\006SYMBOL\0\004\0\t\0\0\0\003')'\0\004\0\t\0\0\0\004expr\0\004\0\t\0\0\0\004expr\0\004\0\t\0\0\0\003','\0\004\0\t\0\0\0\024SYMBOL_FUN'
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Execution halted

This error is intermittent. Memory is plentiful (32GB), and no other large R objects are in memory. The function in the parallel code retrieves a number of small json data objects from the cloud and puts them into an R object - so there are no large data files. I don't know why it occasionally sees an embedded nul and stops.

I have a similar problem with a function that pulls csv files from the cloud as well. Both functions worked fine under R 3.3.0 and R 3.4.0 until now.

I'm using R 3.4.1 and RStudio 1.0.143 on Windows.

Here's my sessionInfo

sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RJSONIO_1.3-0     RcppBDT_0.2.3     zoo_1.8-0         data.table_1.10.4 doParallel_1.0.10 iterators_1.0.8  
[7] RQuantLib_0.4.2   foreach_1.4.3     httr_1.2.1       

loaded via a namespace (and not attached):
[1] Rcpp_0.12.12     lattice_0.20-35  codetools_0.2-15 grid_3.4.1       R6_2.2.2         jsonlite_1.5     tools_3.4.1     
[8] compiler_3.4.1  

UPDATE

Now I get another similar error:

Error in unserialize(node$con) : ReadItem: unknown type 100, perhaps written by later version of R

The embedded nul error seems to have vanished. I've also tried deleting .Rhistory and .Rdata, and also deleting my packages subfolder and reinstalling all packages. At least this new error seems consistent. I can't find out what "unknown type 100" means.

asked Jul 28 '17 by JK_chitown

People also ask

Why does the parallelization fail?

I suspect the parallel failures may be caused by the same underlying issue, just with a different error message. Each core you assign consumes memory, so more cores means more memory is demanded, and as soon as you run out of it you will receive this error. So my suggestion is to reduce the number of cores used for parallelization.

What is parallelisation in foreach?

Parallelisation as done by foreach is a space vs. time trade-off. We get faster execution at the expense of higher memory usage. The reason for the higher memory usage is that several R processes are started and each of them needs its own memory to hold the data necessary for the calculation. Currently foreach is using an implicit PSOCK cluster.

Why are my parallel processes dying?

This may mean your parallel processes "died" because they ran out of memory: you have 32 GB for 11 processes (the default is Ncores - 1, I think), so less than 3 GB per core, and you need some memory left over to run Windows/Linux.
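
To make the memory-per-core advice above concrete, here is a minimal sketch of registering fewer workers than the machine has cores, leaving headroom for the operating system and the master R session; the "detectCores() - 1" rule is just one common heuristic, not a hard requirement:

library(doParallel)

n_workers <- max(1L, detectCores() - 1L)  # leave at least one core free
cl <- makeCluster(n_workers)
registerDoParallel(cl)
# ... run the foreach() loop here ...
stopCluster(cl)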


2 Answers

I also noticed that multi-core sessions don't go away from the task manager.

Switching from stopCluster(cl) to stopImplicitCluster() worked for me. From my reading, stopImplicitCluster() is meant for the "one-line" registration registerDoParallel(cores = x), as opposed to:

cl <- makeCluster(x)
registerDoParallel(cl)

My "gut feeling" is that how Windows handles the clusters requires the stopImplicitCluster, but your experience may vary.

I would have commented but this is (cue band) MY FIRST STACKOVERFLOW POST!!!

answered Oct 02 '22 by naturalBlogarithm


I get a similar error... it usually happens on a subsequent script run after one of my previous scripts errored out or I stopped it early. This could relate to the part where you mention "I don't know why it occasionally sees an embedded nul and stops", which could be the source of the error.

This has some good info, especially about making sure to leave one core free for regular Windows processes. It also mentions "If you get an error from either of those functions, it usually means that at least one of the workers has died", which could back up my theory about crashes after an error:

doParallel error in R: Error in serialize(data, node$con) : error writing to connection
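
Not from the linked answer, but as a hedged sketch: one way to keep a single failed request from killing a worker is to wrap the body in tryCatch() so the error comes back as a value instead of bringing the worker down (my_function() here is just a placeholder for the real worker function):

x <- foreach(i = 1:length(y), .packages = c('httr', 'jsonlite')) %dopar% {
  tryCatch(
    my_function(y[i]),                        # placeholder for the actual work
    error = function(e) conditionMessage(e)   # return the error message instead of dying
  )
}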

So far, my solution has been to re-initialize the parallel backend by running this again:

registerDoParallel(cl)

It usually works afterwards but I do notice that the previous multi-core sessions in my task manager do not go away, even with:

stopCluster(cl)

This is why I sometimes restart R.
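
For completeness, a hedged sketch of a fuller teardown before falling back to restarting R (registerDoSEQ() is from the foreach package and simply points %dopar% back at the sequential backend):

stopCluster(cl)   # ask the workers to shut down
registerDoSEQ()   # restore the sequential foreach backend
rm(cl)
gc()              # run the garbage collector, which can finalize the dropped connection objects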

answered Oct 02 '22 by BigTimeStats