I noticed that in Julia and Python the result of matrix-matrix multiplication differs between sparse and dense arrays when infinity is involved; see the sample code:
julia> using SparseArrays
julia> using LinearAlgebra
julia> A = spdiagm(0 => [0, 1])
2×2 SparseMatrixCSC{Int64,Int64} with 2 stored entries:
[1, 1] = 0
[2, 2] = 1
julia> B = [1 Inf; 1 2]
2×2 Array{Float64,2}:
1.0 Inf
1.0 2.0
julia> A * B
2×2 Array{Float64,2}:
0.0 NaN
1.0 2.0
julia> Array(A) * B
2×2 Array{Float64,2}:
0.0 NaN
1.0 NaN
julia> dropzeros(A) * B
2×2 Array{Float64,2}:
0.0 0.0
1.0 2.0
The same in Python:
from scipy.sparse import diags
import numpy as np
A = diags([0, 1])
B = np.array([[1, np.inf], [1, 2]])
print(f"A=\n{A}")
print(f"B=\n{B}")
print(f"sparse mul:\n{A @ B}")
print(f"dense mul:\n{A.toarray() @ B}")
prints out
A=
(1, 1) 1.0
B=
[[ 1. inf]
[ 1. 2.]]
sparse mul:
[[0. 0.]
[1. 2.]]
/home/.../TestSparseInf.py:9: RuntimeWarning: invalid value encountered in matmul
print(f"dense mul:\n{A.toarray() @ B}")
dense mul:
[[ 0. nan]
[ 1. nan]]
Maybe this is due to the same underlying subroutine; I haven't checked this with other languages so far.
It looks like the product with an unstored entry is always set to zero, so the NaN that 0 * Inf produces in the dense case never appears.
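To make that concrete, here is a minimal sketch of a CSC-style product (my own illustration, not the actual SparseArrays kernel) showing that only stored entries ever get multiplied:

# Minimal sketch of sparse-times-dense multiplication, iterating over
# stored entries only (not the real SparseArrays kernel, just the idea):
using SparseArrays

function spmul_sketch(A::SparseMatrixCSC, B::Matrix)
    size(A, 2) == size(B, 1) || throw(DimensionMismatch())
    m, n = size(A, 1), size(B, 2)
    C = zeros(promote_type(eltype(A), eltype(B)), m, n)
    rows, vals = rowvals(A), nonzeros(A)
    for j in 1:size(A, 2)          # column j of A pairs with row j of B
        for p in nzrange(A, j)     # stored entries only; unstored ones are skipped
            i, v = rows[p], vals[p]
            for c in 1:n
                C[i, c] += v * B[j, c]   # a *stored* zero still gives 0 * Inf = NaN
            end
        end
    end
    return C
end

spmul_sketch(A, B)             # [0.0 NaN; 1.0 2.0], like A * B above
spmul_sketch(dropzeros(A), B)  # [0.0 0.0; 1.0 2.0], the (1,1) zero is never visited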
I haven't found any documentation mentioning this behavior. Does anybody know if this is common or agreed upon somewhere?
Especially in Julia I would expect, from a mathematical point of view, that dropzeros does not alter the result, which is not the case here.
scipy, on the other hand, drops zeros automatically, so I found no way to reproduce the result of the first Julia multiplication (A * B).
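In Julia terms, scipy's behavior corresponds to applying dropzeros at construction time; you can see the difference in the stored-entry counts (continuing the Julia session above):

julia> nnz(A), nnz(dropzeros(A))
(2, 1)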
The TL;DR is that sparse matrices are a massive performance win precisely because you don't have to check what 0 * x is. If 99.9% of your entries are zeros (as is often the case), then checking for Inf values in the other matrix means doing a lot of extra work.
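To see how much extra work honoring 0 * Inf = NaN would take, here is a hypothetical "IEEE-faithful" wrapper (my own sketch, not anything the libraries offer): for every non-finite value in B it has to find the unstored positions in the matching column of A, which in the worst case means touching all of B and all of A's structure, exactly the work sparsity is supposed to avoid.

# Hypothetical "IEEE-faithful" sparse product (my own sketch, not a library API):
# structural zeros of A get dense semantics, i.e. 0 * Inf = NaN.
using SparseArrays

function spmul_ieee(A::SparseMatrixCSC, B::Matrix)
    C = A * B                                      # fast path over stored entries
    for j in 1:size(A, 2)                          # column j of A meets row j of B
        stored = rowvals(A)[nzrange(A, j)]
        length(stored) == size(A, 1) && continue   # no structural zeros in this column
        for c in 1:size(B, 2)
            if !isfinite(B[j, c])                  # 0 * (±Inf or NaN) would be NaN
                for i in setdiff(1:size(A, 1), stored)
                    C[i, c] = NaN                  # structural zero times non-finite value
                end
            end
        end
    end
    return C
end

spmul_ieee(A, B)             # [0.0 NaN; 1.0 NaN], matches Array(A) * B
spmul_ieee(dropzeros(A), B)  # same result: the stored structure no longer matters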