Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does matrix multiplication with sparse matrices differ from dense ones if Inf is involved?

I noticed, that in Julia and Python the result of matrix-matrix multiplication is different for sparse and dense arrays, if infinity is involved, see sample code:

julia> using SparseArrays

julia> using LinearAlgebra

julia> A = spdiagm(0 => [0, 1])
2×2 SparseMatrixCSC{Int64,Int64} with 2 stored entries:
  [1, 1]  =  0
  [2, 2]  =  1

julia> B = [1 Inf; 1 2]
2×2 Array{Float64,2}:
 1.0  Inf
 1.0   2.0

julia> A * B
2×2 Array{Float64,2}:
 0.0  NaN
 1.0    2.0

julia> Array(A) * B
2×2 Array{Float64,2}:
 0.0  NaN
 1.0  NaN

julia> dropzeros(A) * B
2×2 Array{Float64,2}:
 0.0  0.0
 1.0  2.0

same in Python

from scipy.sparse import diags
import numpy as np

A = diags([0, 1])
B = np.array([[1, np.inf], [1, 2]])
print(f"A=\n{A}")
print(f"B=\n{B}")
print(f"sparse mul:\n{A @ B}")
print(f"dense mul:\n{A.toarray() @ B}")

prints out

A=
  (1, 1)    1.0
B=
[[ 1. inf]
 [ 1.  2.]]
sparse mul:
[[0. 0.]
 [1. 2.]]
/home/.../TestSparseInf.py:9: RuntimeWarning: invalid value encountered in matmul
  print(f"dense mul:\n{A.toarray() @ B}")
dense mul:
[[ 0. nan]
 [ 1. nan]]

maybe this is due to the same underling subroutine, haven't checked this with other languages so far. It looks like the product with some unstored entry is always set to zero and therefore no NaN is produced as in case of 0 * inf, which comes up in dense arrays.

I haven't found any documentation mentioning this behavior. Does anyboy know if this is common or somewhere agreed upon? Especially in Julia I would expect, from a mathematical point of view, that dropzeros does not alter the result, which is not the case here. scipy on the other hand drops zeros automatically, therefore I found no way the reproduce the result of the first Julia multiplication (A * B).

like image 687
Hugou Avatar asked Aug 31 '20 13:08

Hugou


People also ask

What is the difference between sparse matrix and dense matrix?

A matrix that has been compressed to eliminate zero-values is a sparse matrix, whereas a matrix with zero and nonzero values is a dense matrix.

What's the difference between sparse and dense?

Sparse dimensions lack data values for the majority of member combinations. Dense dimensions have data values for the majority of member combinations. At least one dense dimension is required.

Why is sparse matrix important than dense matrix?

Operations using standard dense-matrix structures and algorithms are slow and inefficient when applied to large sparse matrices as processing and memory are wasted on the zeros. Sparse data is by nature more easily compressed and thus requires significantly less storage.

What is the sparse matrix Why do we need to use a sparse matrix instead of a simple matrix What are the various ways of representing the sparse matrix in the system?

If most of the elements of the matrix have 0 value, then it is called a sparse matrix. Why to use Sparse Matrix instead of simple matrix ? Storage: There are lesser non-zero elements than zeros and thus lesser memory can be used to store only those elements.


1 Answers

The TLDR is that sparse matrices are a massive performance win because you don't have to check what 0*x is. If 99.9% of your entries are zeroes (this is often the case), then checking for inf values in the other matrix is doing a lot of extra work.

like image 136
Oscar Smith Avatar answered Sep 18 '22 16:09

Oscar Smith