Double the allocations reported by @time for large vectors in Julia

Consider the following simple program in Julia:

function foo_time(x)
    @time x.^2
    return nothing
end
n = 1000;
foo_time(collect(1:n));

If I run that in my console, then @time reports 1 allocation, which is what I expect. However, if I change n to 10000, then @time reports 2 allocations.

What is more, if I chain operations together without full syntactic loop fusion (in other words, without dotting every operation), then I seem to get double the expected allocations. For example, writing (x + x).^2 + x instead of x.^2 yields 3 allocations with n = 1000, but 6 allocations with n = 10000. (The pattern does not strictly continue, though: for instance, (x + x + x).^2 yields only 5 allocations for n = 10000.)

Why should the size of the vector affect how many allocations occur? What is going on under the hood here?

This occurs both in the JupyterLab console and in the normal Julia REPL.

Grayscale asked Mar 03 '23 07:03



1 Answer

Why is there one allocation with small vectors and two allocations with big vectors?

Really, this doesn't matter; it is an internal detail of how arrays work. Essentially, a Julia Array has two parts: the internal header (which keeps track of things like the array's dimensionality and element type) and the data itself. When an array is small, there's an advantage to bundling these two segments into a single allocation, but when it is big, there's an advantage to keeping them separate. This isn't a broadcasting thing, it's just an Array allocation thing:

julia> f(n) = (@time Vector{Int}(undef, n); nothing)
f (generic function with 1 method)

julia> f(2048)
  0.000003 seconds (1 allocation: 16.125 KiB)

julia> f(2049)
  0.000003 seconds (2 allocations: 16.141 KiB)

Then hopefully you can see why this leads to double the number of allocations for large arrays when temporaries are involved: there's one for each array's header and one for each array's data.
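To see the role of temporaries independently of how headers and data are counted, it can help to compare allocated bytes rather than allocation counts. The following is a minimal sketch, not part of the original answer; the names unfused, fused, and measure are hypothetical helpers introduced here for illustration, and the exact numbers printed will depend on the Julia version.

```julia
# Hypothetical helpers for illustration; none of these names come from the post.

# Only partly dotted: x + x and the final + x each build a temporary array.
unfused(x) = (x + x).^2 + x

# Every operation dotted via @., so the whole expression fuses into one loop
# that writes a single output array.
fused(x) = @. (x + x)^2 + x

# Wrapping @allocated in a function avoids measuring global-scope overhead.
measure(f, x) = @allocated f(x)

x = collect(1:10_000)

# Warm-up calls so that compilation is not counted in the measurement.
measure(unfused, x)
measure(fused, x)

bytes_unfused = measure(unfused, x)
bytes_fused = measure(fused, x)

println("unfused: ", bytes_unfused, " bytes")
println("fused:   ", bytes_fused, " bytes")
```

Under these assumptions, the unfused version should report roughly one array's worth of bytes per temporary, while the fused version should report roughly one array's worth in total, regardless of whether each array shows up as one allocation or two.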

In short, don't worry too much about the number of allocations. There are times when allocations can actually improve performance. The time to be concerned is when you see a huge number of allocations, especially if that number is proportional to the number of elements in the array.
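As a rough illustration of the worrying case, here is a sketch (again not from the original answer) of a loop whose allocations scale with the input length, next to one that does not allocate per element. The function names per_element, no_alloc, and measure are hypothetical, and the slicing in per_element is a deliberately wasteful pattern chosen only to make the effect visible.

```julia
# Hypothetical example functions; the names are made up for illustration.

# Deliberately wasteful: x[i:i] copies a one-element array on every iteration,
# so allocations should grow with length(x).
function per_element(x)
    total = zero(eltype(x))
    for i in eachindex(x)
        total += sum(x[i:i])
    end
    return total
end

# Same result, reading each element directly without creating arrays.
function no_alloc(x)
    total = zero(eltype(x))
    for i in eachindex(x)
        total += x[i]
    end
    return total
end

# Wrapping @allocated in a function avoids measuring global-scope overhead.
measure(f, x) = @allocated f(x)

small = collect(1:1_000)
big = collect(1:10_000)

# Warm-up calls so that compilation is not counted in the measurement.
measure(per_element, small)
measure(no_alloc, small)

bytes_small = measure(per_element, small)
bytes_big = measure(per_element, big)

println("per_element, n = 1000:  ", bytes_small, " bytes")
println("per_element, n = 10000: ", bytes_big, " bytes")
println("no_alloc,    n = 10000: ", measure(no_alloc, big), " bytes")
```

If the bytes reported for per_element grow roughly in step with n while no_alloc stays flat, that is the kind of pattern worth investigating, as opposed to a fixed handful of allocations per call.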

mbauman answered Apr 30 '23 17:04
