
Julia: A fast and elegant way to get a matrix from an array of arrays

There is an array of arrays containing more than 10,000 pairs of Float64 values. Something like this:

v = [[rand(),rand()], ..., [rand(),rand()]]

I want to turn it into a matrix with two columns. Iterating over all pairs with a loop is possible; it looks cumbersome, but gives the result in a fraction of a second:

x = Vector{Float64}()
y = Vector{Float64}()
for i = 1:length(v)
    push!(x, v[i][1])
    push!(y, v[i][2])
end
w = hcat(x,y)

The solution permutedims(reshape(hcat(v...), (length(v[1]), length(v)))), which I found in another question, looks more elegant but hangs Julia completely, so the session has to be restarted. Perhaps it was optimal six years ago, but it does not work now for large arrays. Is there a solution that is both compact and fast?

Anton Degterev asked Dec 06 '22 08:12


2 Answers

I hope this is short and efficient enough for you:

getindex.(v, [1 2])

and if you want something simpler to digest:

[v[i][j] for i in 1:length(v), j in 1:2]

Also the hcat solution could be written as:

permutedims(reshape(reduce(hcat, v), (length(v[1]), length(v))));

and it should not hang your Julia (please confirm - it works for me).
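
A small check that the three expressions above give the same matrix, as a sketch only. The input `v` here is a hypothetical four-pair vector, and the names `a`, `b`, `c` are introduced just for illustration:

```julia
# Hypothetical small input: four pairs of Float64 values.
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

# Broadcast getindex over the vector (rows) and the 1×2 index matrix (columns).
a = getindex.(v, [1 2])

# Two-dimensional comprehension: i runs over rows, j over columns.
b = [v[i][j] for i in 1:length(v), j in 1:2]

# reduce(hcat, v) concatenates the pairs as columns without splatting v,
# giving a 2×4 matrix; permutedims then turns it into 4×2.
c = permutedims(reshape(reduce(hcat, v), (length(v[1]), length(v))))

@assert size(a) == (4, 2)
@assert a == b == c
```

The difference from the version in the question is only that `reduce(hcat, v)` replaces `hcat(v...)`, so the 10,000 inner arrays are never passed as separate function arguments.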

@Antonello: to understand why this works consider a simpler example:

julia> string.(["a", "b", "c"], [1 2])
3×2 Matrix{String}:
 "a1"  "a2"
 "b1"  "b2"
 "c1"  "c2"

I am broadcasting a column Vector ["a", "b", "c"] against a 1-row Matrix [1 2]. The point is that [1 2] is a Matrix, so broadcasting expands along both rows (forced by the Vector) and columns (forced by the Matrix). For this expansion to happen, it is crucial that the [1 2] matrix has exactly one row. Is this clearer now?
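
The shape rule described above can be checked directly. This is a minimal sketch with hypothetical names (`idx_row`, `idx_vec`, `m`, `threw`) and a made-up three-pair input; it contrasts the 1×2 Matrix `[1 2]` with the 2-element Vector `[1, 2]`:

```julia
idx_row = [1 2]      # 1×2 Matrix{Int}: one row, two columns
idx_vec = [1, 2]     # 2-element Vector{Int}: behaves as a column

# Hypothetical input: three pairs.
v = [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]]

@assert size(idx_row) == (1, 2)
@assert size(idx_vec) == (2,)

# A 3-element vector broadcast against a 1×2 matrix expands to 3×2.
m = getindex.(v, idx_row)
@assert size(m) == (3, 2)
@assert m == [10.0 20.0; 30.0 40.0; 50.0 60.0]

# With a 2-element Vector instead, both arguments vary along the first
# dimension with different lengths (3 and 2), so broadcasting fails.
threw = try
    getindex.(v, idx_vec)
    false
catch err
    err isa DimensionMismatch
end
@assert threw
```

The space in `[1 2]` versus the comma in `[1, 2]` is therefore what decides whether the indices spread across columns or clash with the rows.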

Bogumił Kamiński answered Dec 27 '22 01:12


Your own example is pretty close to a good solution, but it does some unnecessary work by creating two separate vectors and repeatedly calling push!. This solution is similar, but simpler. It is not as terse as the broadcasted getindex by @BogumilKaminski, but is faster:

function mat(v)
    M = Matrix{eltype(eltype(v))}(undef, length(v), 2)
    for i in eachindex(v)
        M[i, 1] = v[i][1]
        M[i, 2] = v[i][2]
    end
    return M
end

You can simplify it a bit further, without losing performance, like this:

function mat_simpler(v)
    M = Matrix{eltype(eltype(v))}(undef, length(v), 2)
    for (i, x) in pairs(v)
        M[i, 1], M[i, 2] = x
    end
    return M
end
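
To try this on data of the size mentioned in the question, something like the following sketch could be used. The function is copied from above so the snippet runs on its own; the test data and the names `t_loop` and `t_bcast` are assumptions for illustration. `@elapsed` gives only a rough single-run timing, so the printed numbers will vary between machines and runs:

```julia
# Copied from the answer above so this snippet is self-contained.
function mat_simpler(v)
    M = Matrix{eltype(eltype(v))}(undef, length(v), 2)
    for (i, x) in pairs(v)
        M[i, 1], M[i, 2] = x
    end
    return M
end

# Hypothetical test data: 10,000 random pairs, as in the question.
v = [[rand(), rand()] for _ in 1:10_000]

# Call each version once first so compilation is not part of the timing.
mat_simpler(v)
getindex.(v, [1 2])

# Rough single-run timings in seconds.
t_loop = @elapsed mat_simpler(v)
t_bcast = @elapsed getindex.(v, [1 2])
println("loop: ", t_loop, " s, broadcast: ", t_bcast, " s")

# Both approaches should produce the same 10000×2 matrix.
@assert mat_simpler(v) == getindex.(v, [1 2])
```

For a more careful comparison, a benchmarking package that runs many samples would be a better fit than a single `@elapsed` call.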

DNF answered Dec 27 '22 01:12