Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does a subtype of AbstractArray result in imprecise matrix operations in Julia?

I'm currently working on creating a subtype of AbstractArray in Julia, which allows you to store a vector in addition to an Array itself. You can think of it as the column "names", with element types as a subtype of AbstractFloat. Hence, it has some similarities to the NamedArray.jl package, but restricts to only assigning the columns with Floats (in case of matrices).

The struct that I've created so far (following the guide to create a subtype of AbstractArray) is defined as follows:

struct FooArray{T, N, AT, VT} <: AbstractArray{T, N}
    data::AT
    vec::VT
    function FooArray(data::AbstractArray{T1, N}, vec::AbstractVector{T2}) where {T1 <: AbstractFloat, T2 <: AbstractFloat, N}
        length(vec) == size(data, 2) || error("Inconsistent dimensions")
        new{T1, N, typeof(data), typeof(vec)}(data, vec)
    end
end
@inline Base.@propagate_inbounds Base.getindex(fooarr::FooArray, i::Int) = getindex(fooarr.data, i)
@inline Base.@propagate_inbounds Base.getindex(fooarr::FooArray, I::Vararg{Int, 2}) = getindex(fooarr.data, I...)
@inline Base.@propagate_inbounds Base.size(fooarr::FooArray) = size(fooarr.data)
Base.IndexStyle(::Type{<:FooArray}) = IndexLinear()

This already seems to be enough to create objects of type fooArray and do some simple math with it. However, I've observed that some essential functions such as matrix-vector multiplications seem to be imprecise. For example, the following should consistently return a vector of 0.0, but:

R = rand(100, 3)
S = FooArray(R, collect(1.0:3.0))
y = rand(100)
S'y - R'y
3-element Vector{Float64}:
 -7.105427357601002e-15
 0.0
 3.552713678800501e-15

While the differences are very small, they can quickly add up over many different calculations, leading to significant errors.

Where do these differences come from?

A look at the calculations via macro @code_llvm reveals that appearently different matmul functions from LinearAlgebra are used (with other minor differences):

@code_llvm S'y
   ...
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\matmul.jl:111 within `*'
   ...
@code_llvm S'y
   ...
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\matmul.jl:106 within `*'
   ...

Redefining the adjoint and * functions on our FooArray object provides the expected, correct result:

import Base: *, adjoint, /
Base.adjoint(a::FooArray) = FooArray(a.data', zeros(size(a.data, 1)))
*(a::FooArray{T, 2, AT, VT} where {AT, VT}, b::AbstractVector{S}) where {T, S} = a.data * b
S'y - R'y
3-element Vector{Float64}:
 0.0
 0.0
 0.0

However, this solution (which is also done in NamedArrays here) would require defining and maintaining all sorts of functions, not just the standard functions in base, adding more and more dependencies just because of this small error margin.

Is there any simpler way to get rid of this issue without redefining every operation and possibly many other functions from other packages?

I'm using Julia version 1.6.1 on Windows 64-bit system.

like image 967
Claudio Avatar asked Nov 09 '21 13:11

Claudio


People also ask

How does Julia define the matrix?

Julia provides a very simple notation to create matrices. A matrix can be created using the following notation: A = [1 2 3; 4 5 6]. Spaces separate entries in a row and semicolons separate rows. We can also get the size of a matrix using size(A).

How do you create an array of zeros in Julia?

This is usually called with the syntax Type[] . Element values can be specified using Type[a,b,c,...] . zeros([T=Float64,] dims::Tuple) zeros([T=Float64,] dims...) Create an Array , with element type T , of all zeros with size specified by dims .

How does Julia define vector?

A Vector in Julia can be created with the use of a pre-defined keyword Vector() or by simply writing Vector elements within square brackets([]). There are different ways of creating Vector.

How do you fill an array in Julia?

Create an array filled with the value x . For example, fill(1.0, (10,10)) returns a 10x10 array of floats, with each element initialized to 1.0 . If x is an object reference, all elements will refer to the same object. fill(Foo(), dims) will return an array filled with the result of evaluating Foo() once.

What is the array implementation in Julia?

Julia, like most technical computing languages, provides a first-class array implementation. Most technical computing languages pay a lot of attention to their array implementation at the expense of other containers. Julia does not treat arrays in any special way.

What is an abstractarray?

The AbstractArray type includes anything vaguely array-like, and implementations of it might be quite different from conventional arrays. For example, elements might be computed on request rather than stored.

How does Julia decide the data type of the matrix?

Julia automatically decides the data type of the matrix by analyzing the values assigned to it. Because of the ordered nature of a matrix, it makes it easier to perform operations on its values based on their index. Following are some common matrix manipulation operations in Julia:

How to broadcast a single value in an array in Julia?

By placing it inside another container (like a single element Tuple) broadcast will treat it as a single value. The base array type in Julia is the abstract type AbstractArray {T,N}. It is parameterized by the number of dimensions N and the element type T. AbstractVector and AbstractMatrix are aliases for the 1-d and 2-d cases.


1 Answers

Yes, the implementation of matrix multiplication will vary depending upon your array type. The builtin Array will use BLAS, whereas your custom fooArray will use a generic implementation, and due to the non-associativity of floating point arithmetic, these different approaches will indeed yield different values — and note that they may be different from the ground truth, even for the builtin Arrays!

julia> using Random; Random.seed!(0); R = rand(100, 3); y = rand(100);

julia> R'y - Float64.(big.(R)'big.(y))
3-element Vector{Float64}:
 -3.552713678800501e-15
  0.0
  0.0

You may be able to implement your custom array as a DenseArray, which will ensure that it uses the same (BLAS-enabled) codepath. You just need to implement a few more methods, most importantly strides and unsafe_convert:

julia> struct FooArray{T, N} <: DenseArray{T, N}
          data::Array{T, N}
       end
       Base.getindex(fooarr::FooArray, i::Int) = fooarr.data[i]
       Base.size(fooarr::FooArray) = size(fooarr.data)
       Base.IndexStyle(::Type{<:FooArray}) = IndexLinear()
       Base.strides(fooarr::FooArray) = strides(fooarr.data)
       Base.unsafe_convert(P::Type{Ptr{T}}, fooarr::FooArray{T}) where {T} = Base.unsafe_convert(P, fooarr.data)

julia> R = rand(100, 3); S = FooArray(R); y = rand(100)
       R'y - S'y
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> R = rand(100, 1000); S = FooArray(R); y = rand(100)
       R'y == S'y
true
like image 149
mbauman Avatar answered Oct 17 '22 12:10

mbauman