I noticed that in Julia and Python the result of matrix-matrix multiplication differs between sparse and dense arrays when infinity is involved; see the sample code:
julia> using SparseArrays
julia> using LinearAlgebra
julia> A = spdiagm(0 => [0, 1])
2×2 SparseMatrixCSC{Int64,Int64} with 2 stored entries:
[1, 1] = 0
[2, 2] = 1
julia> B = [1 Inf; 1 2]
2×2 Array{Float64,2}:
1.0 Inf
1.0 2.0
julia> A * B
2×2 Array{Float64,2}:
0.0 NaN
1.0 2.0
julia> Array(A) * B
2×2 Array{Float64,2}:
0.0 NaN
1.0 NaN
julia> dropzeros(A) * B
2×2 Array{Float64,2}:
0.0 0.0
1.0 2.0
The same in Python:
from scipy.sparse import diags
import numpy as np
A = diags([0, 1])
B = np.array([[1, np.inf], [1, 2]])
print(f"A=\n{A}")
print(f"B=\n{B}")
print(f"sparse mul:\n{A @ B}")
print(f"dense mul:\n{A.toarray() @ B}")
prints out
A=
(1, 1) 1.0
B=
[[ 1. inf]
[ 1. 2.]]
sparse mul:
[[0. 0.]
[1. 2.]]
/home/.../TestSparseInf.py:9: RuntimeWarning: invalid value encountered in matmul
print(f"dense mul:\n{A.toarray() @ B}")
dense mul:
[[ 0. nan]
[ 1. nan]]
Maybe this is due to the same underlying subroutine; I haven't checked this with other languages so far.
It looks like the product with an unstored entry is always set to zero, so the NaN that 0 * Inf produces in the dense case never appears.
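To make that concrete, here is a minimal sketch of a CSC-style product (my own illustration, not the actual SparseArrays kernel) showing that only stored entries ever get multiplied:

# Minimal sketch of sparse-times-dense multiplication, iterating over
# stored entries only (not the real SparseArrays kernel, just the idea):
using SparseArrays

function spmul_sketch(A::SparseMatrixCSC, B::Matrix)
    size(A, 2) == size(B, 1) || throw(DimensionMismatch())
    m, n = size(A, 1), size(B, 2)
    C = zeros(promote_type(eltype(A), eltype(B)), m, n)
    rows, vals = rowvals(A), nonzeros(A)
    for j in 1:size(A, 2)          # column j of A pairs with row j of B
        for p in nzrange(A, j)     # stored entries only; unstored ones are skipped
            i, v = rows[p], vals[p]
            for c in 1:n
                C[i, c] += v * B[j, c]   # a *stored* zero still gives 0 * Inf = NaN
            end
        end
    end
    return C
end

spmul_sketch(A, B)             # [0.0 NaN; 1.0 2.0], like A * B above
spmul_sketch(dropzeros(A), B)  # [0.0 0.0; 1.0 2.0], the (1,1) zero is never visited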
I haven't found any documentation mentioning this behavior. Does anybody know if this is common or agreed upon somewhere?
Especially in Julia I would expect, from a mathematical point of view, that dropzeros does not alter the result, which is not the case here.
scipy, on the other hand, drops zeros automatically, so I found no way to reproduce the result of the first Julia multiplication (A * B).
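In Julia terms, scipy's behavior corresponds to applying dropzeros at construction time; you can see the difference in the stored-entry counts (continuing the Julia session above):

julia> nnz(A), nnz(dropzeros(A))
(2, 1)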
The TL;DR is that sparse matrices are a massive performance win precisely because you don't have to check what 0 * x is. If 99.9% of your entries are zeros (as is often the case), then checking for Inf values in the other matrix means doing a lot of extra work.
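To see how much extra work honoring 0 * Inf = NaN would take, here is a hypothetical "IEEE-faithful" wrapper (my own sketch, not anything the libraries offer): for every non-finite value in B it has to find the unstored positions in the matching column of A, which in the worst case means touching all of B and all of A's structure, exactly the work sparsity is supposed to avoid.

# Hypothetical "IEEE-faithful" sparse product (my own sketch, not a library API):
# structural zeros of A get dense semantics, i.e. 0 * Inf = NaN.
using SparseArrays

function spmul_ieee(A::SparseMatrixCSC, B::Matrix)
    C = A * B                                      # fast path over stored entries
    for j in 1:size(A, 2)                          # column j of A meets row j of B
        stored = rowvals(A)[nzrange(A, j)]
        length(stored) == size(A, 1) && continue   # no structural zeros in this column
        for c in 1:size(B, 2)
            if !isfinite(B[j, c])                  # 0 * (±Inf or NaN) would be NaN
                for i in setdiff(1:size(A, 1), stored)
                    C[i, c] = NaN                  # structural zero times non-finite value
                end
            end
        end
    end
    return C
end

spmul_ieee(A, B)             # [0.0 NaN; 1.0 NaN], matches Array(A) * B
spmul_ieee(dropzeros(A), B)  # same result: the stored structure no longer matters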