I want to run a parallel for loop. Each of my processes needs access to two large dictionaries, gene_dict and transcript_dict. This is what I tried first:
@everywhere( function EM ... end )
generefs = [ @spawnat i genes for i in 2:nprocs()]
dict1refs = [ @spawnat i gene_dict for i in 2:nprocs()]
dict2refs = [ @spawnat i transcript_dict for i in 2:nprocs()]
result = @parallel (vcat) for i in 1:length(genes)
    EM(genes[i], gene_dict, transcript_dict)
end
but I get the following error on all processes (not just on 5):
exception on 5: ERROR: genes not defined
in anonymous at no file:1514
in anonymous at multi.jl:1364
in anonymous at multi.jl:820
in run_work_thunk at multi.jl:593
in run_work_thunk at multi.jl:602
in anonymous at task.jl:6
UndefVarError(:genes)
I thought @spawnat would move the three data structures I need to all of the processes. My first thought is that this move may take a while, and the parallel for loop tries to run before the data transfer is complete.
The data is moved by @spawnat, but it is not bound to variables with the same names as on the master node. Instead, the data is saved in the fairly hidden Dict named Base.PGRP on the workers. To access the values, you'll have to fetch the RemoteRefs, which in your case would be something like:
result = @parallel (vcat) for i in 1:length(genes)
    EM(fetch(genes[i]), fetch(gene_dict[i]), fetch(transcript_dict[i]))
end
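For readers on current Julia (1.x), where @parallel has become @distributed in the Distributed standard library, here is a minimal sketch of one way to get the dictionaries to the workers without managing references by hand. It assumes that wrapping the loop in a function is acceptable: the dictionaries then become local variables that the loop body captures and ships to each worker. The EM body, the run_em name, and the toy data are hypothetical stand-ins, not the asker's actual code.

```julia
using Distributed

# Assumption: two local worker processes; adjust to your machine.
nprocs() == 1 && addprocs(2)

# Hypothetical stand-in for the asker's EM, defined on every process.
@everywhere function EM(gene, gene_dict, transcript_dict)
    transcripts = gene_dict[gene]
    return sum(transcript_dict[t] for t in transcripts)
end

# Inside a function the arguments are locals, so the loop body's closure
# carries them to the workers along with the work itself.
function run_em(genes, gene_dict, transcript_dict)
    return @distributed (vcat) for i in 1:length(genes)
        EM(genes[i], gene_dict, transcript_dict)
    end
end

# Toy data (hypothetical) just to exercise the sketch.
genes = ["g1", "g2"]
gene_dict = Dict("g1" => ["t1", "t2"], "g2" => ["t3"])
transcript_dict = Dict("t1" => 1.0, "t2" => 2.0, "t3" => 3.0)

result = run_em(genes, gene_dict, transcript_dict)
```

Note that this copies both dictionaries to every worker that receives a chunk of the loop, so for very large dictionaries it may be worth measuring the transfer cost.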