Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Array subsetting in Julia

Tags:

arrays

julia

With the Julia Language, I defined a function to sample points uniformly inside the sphere of radius 3.14 using rejection sampling as follows:

function spherical_sample(N::Int64)

    # generate N points uniformly distributed inside sphere 
    # using rejection sampling:

    points = pi*(2*rand(5*N,3).-1.0)

    ind = sum(points.^2,dims=2) .<= pi^2

    ## ideally I wouldn't have to do this: 
    ind_ = dropdims(ind,dims=2)

    return points[ind_,:][1:N,:]

end

I found a hack for subsetting arrays:

ind = sum(points.^2,dims=2) .<= pi^2

## ideally I wouldn't have to do this: 
ind_ = dropdims(ind,dims=2)

But, in principle array indexing should be a one-liner. How could I do this better in Julia?

like image 630
Aidan Rocke Avatar asked Feb 07 '20 16:02

Aidan Rocke


4 Answers

The problem is that you are creating a 2-dimensional index vector. You can avoid it by using eachrow:

ind = sum.(eachrow(points.^2)) .<= pi^2

So that your full answer would be:

function spherical_sample(N::Int64)
    points = pi*(2*rand(5*N,3).-1.0)
    ind = sum.(eachrow(points.^2)) .<= pi^2
    return points[ind,:][1:N,:]
end
like image 139
David Varela Avatar answered Oct 04 '22 22:10

David Varela


Here is a one-liner:

points[(sum(points.^2,dims=2) .<= pi^2)[:],:][1:N, :]

Note that [:] is dropping a dimension so the BitArray can be used for indexing.

like image 28
Przemyslaw Szufel Avatar answered Oct 04 '22 21:10

Przemyslaw Szufel


This does not answer your question directly (as you already got two suggestions), but I rather thought to hint how you could implement the whole procedure differently if you want it to be efficient.

The first point is to avoid generating 5*N rows of data - the problem is that it is very likely that it will be not enough to generate N valid samples. The point is that the probability of a valid sample in your model is ~50%, so it is possible that there will not be enough points to choose from and [1:N, :] selection will throw an error.

Below is the code I would use that avoids this problem:

function spherical_sample(N::Integer) # no need to require Int64 only here
    points = 2 .* pi .* rand(N, 3) .- 1.0 # note that all operations are vectorized to avoid excessive allocations
    while N > 0 # we will run the code until we have N valid rows
        v = @view points[N, :] # use view to avoid allocating
        if sum(x -> x^2, v) <= pi^2 # sum accepts a transformation function as a first argument
            N -= 1 # row is valid - move to the previous one
        else
            rand!(v) # row is invalid - resample it in place
            @. v = 2 * pi * v - 1.0 # again - do the computation in place via broadcasting
        end
    end
    return points
end
like image 43
Bogumił Kamiński Avatar answered Oct 04 '22 23:10

Bogumił Kamiński


This one is pretty fast, and uses StaticArrays. You can probably also implement something similar with ordinary tuples:

using StaticArrays

function sphsample(N)
    T = SVector{3, Float64}
    v = Vector{T}(undef, N)
    n = 1
    while n <= N
        p = rand(T) .- 0.5
        @inbounds v[n] = p .* 2π
        n += (sum(abs2, p) <= 0.25)
    end
    return v
end

On my laptop it is ~9x faster than the solution with views.

like image 40
DNF Avatar answered Oct 04 '22 23:10

DNF