Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to reshape Arrays quickly

Tags:

julia

In the following code I am using the Julia Optim package for finding an optimal matrix with respect to an objective function. Unfortunately the provided optimize function only supports vectors, so I have to transform the matrix to a vector before passing it to the optimize function, and also transform it back when using it in the objective function.

function opt(A0,X)    
    I1(A) = sum(maximum(X*A,1))

    function transform(A)
      # reshape matrix to vector
      return reshape(A,prod(size(A)))
    end

    function transformback(tA)
      # reshape vector to matrix
      return reshape(tA, size(A0))
    end

    obj(tA) = -I1(transformback(tA))
    result = optimize(obj, transform(A0), method = :nelder_mead)
    return transformback(result.minimum)
end

I think Julia is allocating new space for this every time and it feels slow, so what would be a more efficient way to tackle this problem?

like image 579
axsk Avatar asked Jan 09 '23 18:01

axsk


2 Answers

So long as arrays contain elements that are considered immutable, which includes all primitives, then elements of an array are contained in 1 big contiguous blob of memory. So you can break dimension rules and simply treat a 2 dimensional array as a 1-dimensional array, which is what you want to do. So you don't need to reshape, but I don't think reshape is your problem

Arrays are column major and contiguous

Consider the following function

function enumerateArray(a)
   for i = 1:*(size(a)...)
      print(a[i])
   end
end

This function multiplies all of the dimensions of a together and then loops from 1 to that number assuming a is one dimensional.

When you define a as the following

julia> a = [ 1 2; 3 4; 5 6]
3x2 Array{Int64,2}:
 1  2  
 3  4
 5  6

The result is

julia> enumerateArray(a)
135246

This illustrates a couple of things.

  1. Yes it actually works
  2. Matrices are stored in column-major format

reshape

So, the question is why doesn't reshape use that fact? Well it does. Here's the julia source for reshape in array.c

a = (jl_array_t*)allocobj((sizeof(jl_array_t) + sizeof(void*) + ndimwords*sizeof(size_t) + 15)&-16);

So yes a new array is created, but the only the new dimension information is created, it points back to the original data which is not copied. You can verify this simply like this:

 b = reshape(a,6);

julia> size(b)
(6,)

julia> size(a)
(3,2)

julia> b[4]=100
100

julia> a
3x2 Array{Int64,2}:
 1  100
 3    4
 5    6

So setting the 4th element of b sets the (1,2) element of a.

As for overall slowness

I1(A) = sum(maximum(X*A,1))

will create a new array.

You can use a couple of macros to track this down @profile and @time. Time will additionally record the amount of memory allocated and can be put in front of any expression.

For example

julia> A = rand(1000,1000);
julia> X = rand(1000,1000);
julia> @time sum(maximum(X*A,1))
elapsed time: 0.484229671 seconds (8008640 bytes allocated)
266274.8435928134

The statistics recorded by @profile are output using Profile.print()

like image 85
waTeim Avatar answered Feb 27 '23 23:02

waTeim


Also, most methods in Optim actually allow you to supply Arrays, not just Vectors. You could generalize the nelder_mead function to do the same.

like image 30
tholy Avatar answered Feb 27 '23 22:02

tholy