Why/how to determine when a function overwrites a local variable in Julia?

Tags:

scope

julia

I am relatively new to Julia, and I am porting over some C functions to check the speed difference. One thing I'm struggling with is the scope of variables. Specifically, sometimes a function call in Julia overwrites a local variable, and other times it does not. For example, here's a function to calculate a minimum spanning tree:

using LinearAlgebra  # provides diagind

function mst(my_X::Array{Float64})
    n = size(my_X)[1]
    N = zeros(Int16,n,n)
    tree = []
    lv = maximum(my_X)+1
    my_X[diagind(my_X)] .=lv
    indexi = 1
    for ijk in 1:(n-1)
        tree = vcat(tree, indexi)
        m = minimum(my_X[:,tree],dims = 1)
        a = zeros(Int64, length(tree))
        print(tree)
        for k in 1:length(tree)
            a[k] = sortperm(my_X[:,tree[k]])[1]
        end
        b = sortperm(vec(m))[1]
        indexj = tree[b]
        indexi = a[b]
        N[indexi,indexj] = 1
        N[indexj,indexi] = 1
        for j in tree
            my_X[indexi,j] = lv
            my_X[j,indexi] = lv
        end
    end
    return N
end

Now we can apply this to a distance matrix X:

julia> X
5×5 Array{Float64,2}:
 0.0   0.54  1.08  1.12  0.95
 0.54  0.0   0.84  0.67  1.05
 1.08  0.84  0.0   0.86  1.14
 1.12  0.67  0.86  0.0   1.2
 0.95  1.05  1.14  1.2   0.0

But when I do so, it overwrites all of the entries of X:

julia> M = mst(X)
julia> M
5×5 Array{Int16,2}:
 0  1  0  0  1
 1  0  1  1  0
 0  1  0  0  0
 0  1  0  0  0
 1  0  0  0  0
julia> X
5×5 Array{Float64,2}:
 2.2  2.2  2.2  2.2  2.2
 2.2  2.2  2.2  2.2  2.2
 2.2  2.2  2.2  2.2  2.2
 2.2  2.2  2.2  2.2  2.2
 2.2  2.2  2.2  2.2  2.2

Of course I can work around this if I explicitly put something like this in the function:

function mst(my_Z::Array{Float64})
    my_X = copy(my_Z)
     .
     .
     .

But it seems like the issue is deeper than this. For example, if I try to replicate this in a simple example I can't recreate the issue:

function add_one(my_X::Int64)
    my_X = my_X + 1
    return my_X
end
julia> Z = 1
julia> W = add_one(Z)
julia> W
2
julia> Z
1

What is going on here? I've read and re-read the Julia help docs on variable scopes and I cannot figure out what the distinction is.

user3037237 asked May 04 '20 09:05

1 Answer

There are the following inter-related issues here:

  1. Values in Julia can be either mutable or immutable.
  2. A variable in Julia is bound to a value (which can be either immutable or mutable).
  3. Some operations can modify a mutable value.

So the first point is about mutability vs immutability of values. The Julia manual discusses this in more detail. You can check whether a value is mutable or not using the isimmutable function (Julia 1.5 and later also provide ismutable, which answers the opposite question).

Typical cases are the following:

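  1. Numbers, strings, Tuple, NamedTuple and structs are immutable. (isimmutable nonetheless returns false for a String below; this is a quirk of how String is implemented, and strings still cannot be modified in place.)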
julia> isimmutable(1)
true

julia> isimmutable("sdaf")
false

julia> isimmutable((1,2,3))
true

  2. Arrays, dicts, mutable structs etc. (in general container types other than Tuple, NamedTuple and structs) are mutable:
julia> isimmutable([1,2,3])
false

julia> isimmutable(Dict(1=>2))
false

The key difference between immutable and mutable values is that mutable values can have their contents modified. Here is a simple example:

julia> x = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> x[1] = 10
10

julia> x
3-element Array{Int64,1}:
 10
  2
  3

Now let us dissect what we have seen here:

  • the assignment statement x = [1, 2, 3] binds the value (in this case a vector) to a variable x
  • the statement x[1] = 10 mutates the value (a vector) in place

Note that the same would fail for a Tuple as it is immutable:

julia> x = (1,2,3)
(1, 2, 3)

julia> x[1] = 10
ERROR: MethodError: no method matching setindex!(::Tuple{Int64,Int64,Int64}, ::Int64, ::Int64)

Now we come to the second point: binding a value to a variable name. This is typically done with the = operator when its left-hand side is a plain variable name, as above with x = [1,2,3] or x = (1,2,3).

Note in particular that += (and similar operators) also rebind, e.g.:

julia> x = [1, 2, 3]
3-element Array{Int64,1}:
 1
 2
 3

julia> y = x
3-element Array{Int64,1}:
 1
 2
 3

julia> x += [1,2,3]
3-element Array{Int64,1}:
 2
 4
 6

julia> x
3-element Array{Int64,1}:
 2
 4
 6

julia> y
3-element Array{Int64,1}:
 1
 2
 3

This is because x += [1, 2, 3] is just shorthand for x = x + [1, 2, 3], and we know that = rebinds.
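
One way to see the rebinding directly is to compare object identity with === before and after the update. This is only a minimal sketch; the names same_before and same_after are made up for the illustration.

```julia
x = [1, 2, 3]
y = x                      # y and x are two names for one and the same array
same_before = (x === y)    # true: both names refer to the same object

x += [1, 2, 3]             # shorthand for x = x + [1, 2, 3]; x now names a new array
same_after = (x === y)     # false: x was rebound, y still names the original array
```

The array that y refers to was never touched, which is why y still shows 1, 2, 3 above.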

In particular (as @pszufe noted in the comment) if you pass a value to a function nothing is copied. What happens here is that a variable which is in the function signature is bound to the passed value (this kind of behavior is sometimes called pass by sharing). So you have:

julia> x = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> f(y) = y
f (generic function with 1 method)

julia> f(x) === x
true

Essentially what happens is "as if" you had written y = x. The difference is that the function creates the variable y in a new scope (the scope of the function), while the statement y = x would bind y to the value of x in the scope where that statement appears.
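
This is the distinction between the two functions in the question: add_one only rebinds its argument (my_X = my_X + 1), while mst writes into the array it received (my_X[indexi,j] = lv). A minimal sketch of the two cases follows; the function names rebind and mutate! are hypothetical and chosen only for this illustration.

```julia
# Hypothetical helper names, used only for this illustration.

# Rebinding: `v = ...` makes the local name v refer to a new array.
# The array passed in by the caller is not touched.
function rebind(v)
    v = v .+ 1
    return v
end

# Mutation: `v .= ...` writes into the array that v is bound to,
# which is the very array the caller passed in.
function mutate!(v)
    v .= v .+ 1
    return v
end

a = [1, 2, 3]
b = rebind(a)     # a is still [1, 2, 3]; b is a new array [2, 3, 4]
c = mutate!(a)    # a is now [2, 3, 4]; c is the same object as a
```

In both calls nothing is copied when the argument is passed; the only difference is what the function body does with the name v.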

On the other hand, operations like x[1] = 10 (which is essentially a call to the setindex! function) or x .= [1,2,3] are in-place operations (they do not rebind a variable but try to mutate the container). So this works in place (note that in the example I combine broadcasting with += to make it in place):

julia> x = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> y = x
3-element Array{Int64,1}:
 1
 2
 3

julia> x .+= [1,2,3]
3-element Array{Int64,1}:
 2
 4
 6

julia> y
3-element Array{Int64,1}:
 2
 4
 6

But if we try to do the same with e.g. an integer, which is immutable, the operation fails:

julia> x = 10
10

julia> x .+= 1
ERROR: MethodError: no method matching copyto!(::Int64, ::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0},Tuple{},typeof(+),Tuple{Int64,Int64}})

The same with setting index for an immutable value:

julia> x = 10
10

julia> x[] = 1
ERROR: MethodError: no method matching setindex!(::Int64, ::Int64)

Finally, the third point is which operations try to mutate a value in place. We have already noted some of them (setindex!, as in x[10] = 10, and broadcasting assignment, as in x .= [1,2,3]). In general it is not easy to tell whether calling f(x) will mutate x when f is some general function (it may or may not mutate x if x is mutable). Therefore Julia has a convention of adding ! to the end of the names of functions that may mutate their arguments, to signal this visually. It should be stressed that this is a convention only: adding ! to the end of a function's name has no direct influence on how it works. We have already seen this with setindex! (for which x[1] = 10 is shorthand, as discussed), but here is a different example:

julia> x = [1, 2, 3]
3-element Array{Int64,1}:
 1
 2
 3

julia> filter(==(1), x) # no ! so a new vector is created
1-element Array{Int64,1}:
 1

julia> x
3-element Array{Int64,1}:
 1
 2
 3

julia> filter!(==(1), x) # ! so x is mutated in place
1-element Array{Int64,1}:
 1

julia> x
1-element Array{Int64,1}:
 1
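
The same convention can be followed in your own code. A common pattern, sketched below under the assumption that a mutating and a non-mutating variant are both wanted, is to write the in-place version with a ! and define the non-mutating version as a call to it on a copy. The names set_diagonal! and set_diagonal are hypothetical and are not part of Julia or of the code in the question.

```julia
# Hypothetical function names, used only for illustration.

# Mutating version: overwrites the diagonal of A in place.
function set_diagonal!(A::AbstractMatrix, value)
    for i in 1:min(size(A)...)
        A[i, i] = value
    end
    return A
end

# Non-mutating version: works on a copy, so the caller's matrix is preserved.
set_diagonal(A::AbstractMatrix, value) = set_diagonal!(copy(A), value)

X = [0.0 0.5; 0.5 0.0]
Y = set_diagonal(X, 9.0)    # X keeps its zero diagonal; Y has 9.0 on the diagonal

W = [1.0 2.0; 3.0 4.0]
set_diagonal!(W, 0.0)       # W itself now has 0.0 on the diagonal
```

Applied to the question, this would mean either naming the function mst! to signal that it overwrites its input, or keeping the name mst and working on copy(my_X) internally.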

If you use a function (like setindex!) that mutates its argument and want to avoid mutation, use copy when passing an argument to it (or deepcopy if your structure is nested and mutation could happen at a deeper level, but this is rare).

So in our example:

julia> x = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> y = filter!(==(1), copy(x))
1-element Array{Int64,1}:
 1

julia> y
1-element Array{Int64,1}:
 1

julia> x
3-element Array{Int64,1}:
 1
 2
 3
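
The difference between copy and deepcopy only shows up for nested mutable structures. A minimal sketch, with variable names made up for the illustration:

```julia
nested = [[1, 2], [3, 4]]    # a vector whose elements are themselves mutable vectors

shallow = copy(nested)       # new outer vector, but the inner vectors are shared
deep = deepcopy(nested)      # new outer vector and new inner vectors

nested[1][1] = 100           # mutate an inner vector of the original

shallow[1][1]    # 100, because shallow[1] is the same object as nested[1]
deep[1][1]       # 1, because deep has its own independent inner vectors
```

For a plain numeric matrix such as the distance matrix in the question, copy is sufficient, since the elements themselves are immutable numbers.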
Bogumił Kamiński answered Oct 22 '22 01:10
