
Dropping singleton dimensions in Julia

Tags:

julia

I'm playing around with Julia (1.0), and one thing I use a lot in Python/NumPy/MATLAB is the squeeze function, which drops singleton dimensions.

I found out that one way to do this in Julia is:

a = rand(3, 3, 1);
a = dropdims(a, dims = tuple(findall(size(a) .== 1)...))

The second line seems cumbersome and hard to parse at a glance (though this may be a bias I bring from other languages). Is this the canonical way to do it in Julia?

Luca asked Sep 25 '18 19:09


3 Answers

The actual answer to this question surprised me. What you are asking could be rephrased as:

why doesn't dropdims(a) remove all singleton dimensions?

I'm going to quote Tim Holy from the relevant issue here:

it's not possible to have squeeze(A) return a type that the compiler can infer---the sizes of the input matrix are a runtime variable, so there's no way for the compiler to know how many dimensions the output will have. So it can't possibly give you the type stability you seek.
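One rough way to see the point in the quote is to ask the compiler what it infers for each style of call. The following is only a sketch, assuming Julia 1.x; `squeeze_all` and `drop_third` are illustrative names and not Base functions, and the exact types printed may differ between versions.

```julia
# Illustrative helpers (hypothetical names, not part of Base).
squeeze_all(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
drop_third(a)  = dropdims(a, dims = 3)

a = rand(3, 3, 1)

# At run time both calls produce the same 3x3 matrix.
@assert squeeze_all(a) == drop_third(a)

# Ask inference for the return type given only the argument type.
# For squeeze_all, how many dimensions get dropped depends on size(a),
# a run-time value, so the inferred type is expected to be abstract.
println(Base.return_types(squeeze_all, (Array{Float64,3},)))
println(Base.return_types(drop_third,  (Array{Float64,3},)))
```

Comparing the two printed results shows whether the number of output dimensions is known from the argument type alone.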

Type stability aside, what you have written has some other surprising implications. For example, an array whose dimensions are all singletons collapses to a 0-dimensional array rather than a scalar:

julia> f(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
f (generic function with 1 method)

julia> f(rand(1,1,1))
0-dimensional Array{Float64,0}:
0.9939103383167442
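To make the consequence concrete, here is a small sketch of how that 0-dimensional result behaves. It reuses the `f` defined above; the variable names `x` and `value` are only for illustration.

```julia
f(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))

x = f(rand(1, 1, 1))

@assert ndims(x) == 0        # a 0-dimensional array...
@assert !(x isa Number)      # ...which is not a scalar
@assert length(x) == 1       # it still holds exactly one element

value = x[]                  # empty indexing extracts that element
@assert value isa Float64
```

Code that expects a plain number from such a call has to unwrap it with `x[]` first.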

In summary, including such a method in Base Julia would encourage people to use it, resulting in potentially type-unstable code that, under some circumstances, will not be fast (something the core developers work hard to avoid). Languages like Python do not aim for type stability in this way, so you will find such functions there.

Of course, nothing stops you from defining your own method as you have. And I don't think you'll find a significantly simpler way of writing it. For example, the proposition for Base that was not implemented was the method:

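# Proposed when the function was still named squeeze(A, dims).
# On Julia 0.7/1.0+ the last line would be dropdims(A, dims = singleton_dims).
function squeeze(A::AbstractArray)
    singleton_dims = tuple((d for d in 1:ndims(A) if size(A, d) == 1)...)
    return squeeze(A, singleton_dims)
end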

Just be aware of the potential implications of using it.

Colin T Bowers answered Oct 13 '22 15:10


Let me simply add that "uncontrolled" dropdims (dropping every singleton dimension) is a frequent source of bugs. For example, suppose you have a loop that requests a data array A from some external source, runs R = sum(A, dims=2) on it, and then gets rid of all singleton dimensions. Now suppose that one time out of 10000, the external source returns an A for which size(A, 1) happens to be 1: boom, suddenly you're dropping more dimensions than you intended, and you are at risk of grossly misinterpreting your data.

If you specify those dimensions manually instead (e.g., dropdims(R, dims=2)) then you are immune from bugs like these.
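The scenario can be sketched as follows. The array sizes are made up for illustration, and `squeeze_all` is a hypothetical helper that stands in for "drop every singleton dimension".

```julia
# Hypothetical helper: drops every singleton dimension.
squeeze_all(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))

# Usual case: several rows, so only the summed dimension is a singleton.
A = rand(5, 4)
R = sum(A, dims = 2)                          # size (5, 1)
@assert size(squeeze_all(R)) == (5,)
@assert size(dropdims(R, dims = 2)) == (5,)

# Rare case: the source happens to return a single row.
A1 = rand(1, 4)
R1 = sum(A1, dims = 2)                        # size (1, 1)
@assert size(squeeze_all(R1)) == ()           # both dimensions dropped
@assert size(dropdims(R1, dims = 2)) == (1,)  # still a vector
```

With the explicit `dims = 2`, downstream code always receives a vector, whatever the number of rows.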

tholy answered Oct 13 '22 16:10


You can drop the call to tuple and use a trailing comma instead:

dropdims(a, dims = (findall(size(a) .== 1)...,))

Przemyslaw Szufel answered Oct 13 '22 16:10