
Julia: Converting Vector of Arrays to Array for Arbitrary Dimensions

Using timing tests, I found that it's much more performant to grow Vector{Array{Float64}} objects using push! than it is to simply use an Array{Float64} object and either hcat or vcat. However, after the computation is completed, I need to change the resulting object to an Array{Float64} for further analysis. Is there a way that works regardless of the dimensions? For example, if I generate the Vector of Arrays via

u =  [1 2 3 4
      1 3 3 4
      1 5 6 3
      5 2 3 1]
uFull = Vector{Array{Int}}(0)
push!(uFull,u)
for i = 1:10000
  push!(uFull,u)
end

I can do the conversion like this:

fill = Array{Int}(size(uFull)...,size(u)...)
for i in eachindex(uFull)
  fill[i,:,:] = uFull[i]
end

but notice this requires that I know the arrays are matrices (2-dimensional). If it's 3-dimensional, I would need another :, and so this doesn't work for arbitrary dimensions.

I also need a form of the "inverse transform" in arbitrary dimensions, except that the resulting vector should be indexed by the last index of the full array rather than the first. I currently have

filla = Vector{Array{Int}}(size(fill)[end])
for i in 1:size(fill)[end]
  filla[i] = fill[:,:,i]'
end

I assume the method for the first conversion will likely solve the second as well.

asked May 27 '16 06:05 by Chris Rackauckas


2 Answers

This is the sort of thing that Julia's custom array infrastructure excels at. I think the simplest solution here is to actually make a special array type that does this transformation for you:

immutable StackedArray{T,N,A} <: AbstractArray{T,N}
    data::A # A <: AbstractVector{<:AbstractArray{T,N-1}}
    dims::NTuple{N,Int}
end
function StackedArray(vec::AbstractVector)
    @assert all(size(vec[1]) == size(v) for v in vec)
    StackedArray(vec, (length(vec), size(vec[1])...))
end
StackedArray{T, N}(vec::AbstractVector{T}, dims::NTuple{N}) = StackedArray{eltype(T),N,typeof(vec)}(vec, dims)
Base.size(S::StackedArray) = S.dims
@inline function Base.getindex{T,N}(S::StackedArray{T,N}, I::Vararg{Int,N})
    @boundscheck checkbounds(S, I...)
    S.data[I[1]][Base.tail(I)...]
end

Now just wrap your vector in a StackedArray and it'll behave like an (N+1)-dimensional array. This could be expanded and made more featureful (it could similarly support setindex!, or even push!ing arrays to concatenate natively), but I think it's sufficient to solve your problem. Wrapping uFull in a StackedArray gives you an object that acts like an Array{T, N+1}; make a copy, and you get a dense Array{T, N+1} without ever needing to write a for loop yourself.

julia> S = StackedArray(uFull)
10001x4x4 StackedArray{Int64,3,Array{Array{Int64,2},1}}:
[:, :, 1] =
 1  1  1  5
 1  1  1  5
 1  1  1  5
…

julia> squeeze(S[1:1, :, :], 1) == u
true

julia> copy(S) # returns a dense Array{T,N}
10001x4x4 Array{Int64,3}:
[:, :, 1] =
 1  1  1  5
 1  1  1  5
…
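
For readers on Julia 1.x, here is a minimal sketch of the same idea in current syntax (`struct` and `where` clauses instead of `immutable` and `getindex{T,N}`). It is an adaptation rather than the answer's original code, and the `setindex!` method is an added assumption about how the expansion mentioned above might look.

```julia
# Sketch only: a Julia 1.x adaptation of the StackedArray idea above.
struct StackedArray{T,N,A<:AbstractVector} <: AbstractArray{T,N}
    data::A               # vector of equally sized (N-1)-dimensional arrays
    dims::NTuple{N,Int}   # (length(data), size(data[1])...)
end

function StackedArray(vec::AbstractVector{<:AbstractArray})
    @assert !isempty(vec)
    @assert all(size(v) == size(vec[1]) for v in vec)
    dims = (length(vec), size(vec[1])...)
    StackedArray{eltype(eltype(vec)),length(dims),typeof(vec)}(vec, dims)
end

Base.size(S::StackedArray) = S.dims

@inline function Base.getindex(S::StackedArray{T,N}, I::Vararg{Int,N}) where {T,N}
    @boundscheck checkbounds(S, I...)
    S.data[I[1]][Base.tail(I)...]
end

# Assumption (not in the original answer): writes are forwarded to the
# wrapped arrays, so they also change the vector that was passed in.
@inline function Base.setindex!(S::StackedArray{T,N}, v, I::Vararg{Int,N}) where {T,N}
    @boundscheck checkbounds(S, I...)
    S.data[I[1]][Base.tail(I)...] = v
    return S
end

# Usage: four distinct 2x3 matrices viewed as one 4x2x3 array.
vecs = [fill(k, 2, 3) for k in 1:4]
S = StackedArray(vecs)
dense = Array(S)   # dense copy with the same shape as S
```

Note that if the vector holds the same array object many times, as uFull does in the question, a write through `setindex!` would show up in every slice, because the wrapper shares memory with its contents.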

Finally, I'll just note that there's another solution here: you could introduce the custom array type sooner, and make a GrowableArray that internally stores its elements as a linear Vector{T}, but allows pushing entire columns or arrays directly.
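
A rough sketch of what such a type might look like in Julia 1.x follows. The name GrowableArray comes from the paragraph above, but everything else (the field names, the constructor taking the slice size, and growing along the last dimension so that the flat storage stays column-major) is an assumption made for illustration.

```julia
# Hypothetical sketch: flat storage that grows one slice at a time.
mutable struct GrowableArray{T,N} <: AbstractArray{T,N}
    data::Vector{T}        # all elements, stored contiguously (column-major)
    dims::NTuple{N,Int}    # slice size followed by the number of slices
end

# Start empty, given the size of one slice, e.g. (4, 4) for 4x4 matrices.
function GrowableArray{T}(slicedims::NTuple{M,Int}) where {T,M}
    GrowableArray{T,M+1}(T[], (slicedims..., 0))
end

Base.size(G::GrowableArray) = G.dims
Base.IndexStyle(::Type{<:GrowableArray}) = IndexLinear()
Base.getindex(G::GrowableArray, i::Int) = G.data[i]

# Append one slice; its elements are copied into the flat storage.
function Base.push!(G::GrowableArray, slice::AbstractArray)
    size(slice) == Base.front(G.dims) ||
        throw(DimensionMismatch("slice has the wrong size"))
    append!(G.data, vec(slice))
    G.dims = (Base.front(G.dims)..., G.dims[end] + 1)
    return G
end

# Usage: push two 2x2 matrices, then read the result as a 2x2x2 array.
G = GrowableArray{Int}((2, 2))
push!(G, [1 2; 3 4])
push!(G, [5 6; 7 8])
```

Because the data is already contiguous, no final conversion step is needed in this design; the trade-off is that each push! copies the slice.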

answered Sep 27 '22 00:09 by mbauman


Matt B.'s answer is great, because it "simulates" an array without actually having to create or store it. When you can use this solution, it's likely to be your best choice.

However, there might be circumstances where you need to create a concatenated array (e.g., if you're passing this to some C code which requires contiguous memory). In that case you can just call cat, which is generic (it can handle arbitrary dimensions).

For example:

u =  [1 2 3 4
      1 3 3 4
      1 5 6 3
      5 2 3 1]
uFull = Vector{typeof(u)}(0)
push!(uFull,u)
for i = 1:10000
  push!(uFull,u)
end
# Splat the vector so that cat receives each array as a separate argument
ucat = cat(ndims(eltype(uFull))+1, uFull...)

I took the liberty of making one important change to your code: uFull = Vector{typeof(u)}(0), because it ensures that the objects stored in the Vector container have a concrete type. Array{Int} is actually an abstract type, because it leaves the dimensionality unspecified; the concrete type here is Array{Int,2}.
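
In Julia 1.x the dimension is passed to cat as a keyword, `cat(uFull...; dims=ndims(u)+1)`. Splatting thousands of arrays can be slow, so the following is a hedged sketch of a dimension-generic alternative, together with the inverse transform from the question. The helper names stack_last and unstack_last are made up for this example; selectdim is the Base function that selects a slice along a given dimension.

```julia
# Sketch only: helper names are hypothetical, written for Julia 1.x.

# Stack equally sized arrays along a new trailing dimension, for any ndims.
function stack_last(arrays::AbstractVector{<:AbstractArray})
    slice_size = size(first(arrays))
    out = Array{eltype(first(arrays))}(undef, slice_size..., length(arrays))
    for (i, a) in enumerate(arrays)
        size(a) == slice_size ||
            throw(DimensionMismatch("all arrays must have the same size"))
        # selectdim gives a view of slice i along the last dimension
        selectdim(out, ndims(out), i) .= a
    end
    return out
end

# Inverse: split along the last dimension into a vector of independent copies.
unstack_last(A::AbstractArray) =
    [copy(selectdim(A, ndims(A), i)) for i in axes(A, ndims(A))]

# Usage with matrices; the same calls work for 3-dimensional inputs.
u = [1 2; 3 4]
arrs = [k .* u for k in 1:3]
full = stack_last(arrs)        # 2x2x3 array
parts = unstack_last(full)     # vector of three 2x2 matrices
```

Neither function mentions the number of dimensions explicitly, so no extra `:` is needed when the stored arrays are 3-dimensional or higher.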

answered Sep 23 '22 00:09 by tholy