I want to run a parallel for loop. Each of my processes needs access to two large dictionaries, gene_dict and transcript_dict. This is what I tried first:
@everywhere( function EM ... end )
generefs = [ @spawnat i genes for i in 2:nprocs()]
dict1refs = [ @spawnat i gene_dict for i in 2:nprocs()]
dict2refs = [ @spawnat i transcript_dict for i in 2:nprocs()]
result = @parallel (vcat) for i in 1:length(genes)
    EM(genes[i], gene_dict, transcript_dict)
end
but I get the following error on all processes (not just on 5):
exception on 5: ERROR: genes not defined
in anonymous at no file:1514
in anonymous at multi.jl:1364
in anonymous at multi.jl:820
in run_work_thunk at multi.jl:593
in run_work_thunk at multi.jl:602
in anonymous at task.jl:6
UndefVarError(:genes)
I thought @spawnat would move the three data structures I need to all of the processes. My first thought is that the transfer may take a while, so the parallel for loop starts running before the data has arrived.
@spawnat does move the data, but it does not bind it on the workers to variables with the same names as on the master node. Instead, the data is stored on each worker in the fairly hidden Dict named Base.PGRP. To access the values, you have to fetch the RemoteRefs (generefs, dict1refs and dict2refs in your code), which in your case would be something like:
result = @parallel (vcat) for i in 1:length(genes)
    EM(fetch(genes[i]), fetch(gene_dict[i]), fetch(transcript_dict[i]))
end
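An alternative to fetching references is to bind the data to the same names on every process, so the loop body can use those names directly. The following is only a minimal sketch under stated assumptions: it targets Julia 1.x, where the parallel tools live in the Distributed standard library and @parallel has been renamed @distributed, so it will not run unchanged on the older version the traceback comes from. The contents of genes, gene_dict and transcript_dict, the body of EM, and the worker count are hypothetical stand-ins, not the original data or function.

```julia
using Distributed

# Assumption: start two local worker processes if none are running yet.
nprocs() == 1 && addprocs(2)

# Hypothetical stand-ins for the real data.
genes = ["g1", "g2", "g3"]
gene_dict = Dict("g1" => 1, "g2" => 2, "g3" => 3)
transcript_dict = Dict("g1" => 10, "g2" => 20, "g3" => 30)

# Define each name on every process; `$` interpolates the master's value
# into the expression that is evaluated on the workers.
@everywhere genes = $genes
@everywhere gene_dict = $gene_dict
@everywhere transcript_dict = $transcript_dict

# Placeholder for the real EM function, defined on all processes.
@everywhere EM(gene, gd, td) = gd[gene] + td[gene]

# With a reducer, @distributed waits for the workers and returns the
# combined value.
result = @distributed (vcat) for i in 1:length(genes)
    EM(genes[i], gene_dict, transcript_dict)
end
```

Each worker receives its own full copy of both dictionaries, so memory use grows with the number of workers. For very large dictionaries it may be worth sending only the entries each worker actually needs.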