Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Shared array usage in Julia

I need to parallelise a certain task over a number of workers. To that purpose I need all workers to have access to a matrix that stores the data.

I thought that the data matrix could be implemented as a Shared Array in order to minimise data movement.

In order to get me started with Shared Arrays, I am trying the following very simple example which gives me, what I think is, unexpected behaviour:

julia -p 2

# the data matrix
D = SharedArray(Float64, 2, 3)

# initialise the data matrix with dummy values
for ii=1:length(D)
   D[ii] = rand()
end

# Define some kind of dummy computation involving the shared array 
f = x -> x + sum(D)

# call function on worker
@time fetch(@spawnat 2 f(1.0))

The last command gives me the following error:

 ERROR: On worker 2:
 UndefVarError: D not defined
 in anonymous at none:1
 in anonymous at multi.jl:1358
 in anonymous at multi.jl:904
 in run_work_thunk at multi.jl:645
 in run_work_thunk at multi.jl:654
 in anonymous at task.jl:58
 in remotecall_fetch at multi.jl:731
 in call_on_owner at multi.jl:777
 in fetch at multi.jl:795

I thought that the Shared Array D should be visible to all workers? I am clearly missing something basic. Thanks in advance.

like image 452
user1438310 Avatar asked Mar 02 '16 15:03

user1438310


People also ask

What are shared arrays in Julia?

Shared arrays, as the name suggests, are arrays that are shared across multiple Julia processes on the same machine. Once a shared array is created, it is accessible in full to all the specified workers (on the same machine).

Are arrays mutable in Julia?

In Julia, arrays are actually mutable type collections which are used for lists, vectors, tables, and matrices. That is why the values of arrays in Julia can be modified with the use of certain pre-defined keywords.

How do you create an array in Julia?

An Array in Julia can be created with the use of a pre-defined keyword Array() or by simply writing array elements within square brackets([]).


2 Answers

Although the underlying data is shared to all workers, the declaration of D is not. You will still need to pass in the reference to D, so something like

f = (x,SA) -> x + sum(SA) @time fetch(@spawnat 2 f(1.0,D))

should work. You can change D on the main process and see that it is infact using the same data:

julia> # call function on worker
       @time fetch(@spawnat 2 f(1.0,D))
  0.325254 seconds (225.62 k allocations: 9.701 MB, 5.88% gc time)
4.405613684678047

julia> D[1] += 1
1.2005544517241717

julia> # call function on worker
       @time fetch(@spawnat 2 f(1.0,D))
  0.004548 seconds (637 allocations: 45.490 KB)
5.405613684678047
like image 94
Daniel Arndt Avatar answered Oct 25 '22 14:10

Daniel Arndt


This works, without declaring D, through a closure within a function.

function dothis()
    D = SharedArray{Float64}(2, 3)

    # initialise the data matrix with dummy values
    for ii=1:length(D)
       D[ii] = ii #not rand() anymore
    end

    # Define some kind of dummy computation involving the shared array 
    f = x -> x + sum(D)

    # call function on worker
    @time fetch(@spawnat 2 f(1.0))
end

julia> dothis()
1.507047 seconds (206.04 k allocations: 11.071 MiB, 0.72% gc time)
22.0
julia> dothis()
0.012596 seconds (363 allocations: 19.527 KiB)
22.0

So though I have answered the OP's question, and the SharedArray is visible to all workers -- is this legitimate?

like image 1
branden_burger Avatar answered Oct 25 '22 12:10

branden_burger