Numpy dot too clever about symmetric multiplications


Does anybody know of documentation for this behaviour?

import numpy as np
A  = np.random.uniform(0,1,(10,5))
w  = np.ones(5)
Aw = A*w
Sym1 = Aw.dot(Aw.T)
Sym2 = (A*w).dot((A*w).T)
diff = Sym1 - Sym2

diff.max() is small but non-zero, on the order of machine precision, e.g. 4.4e-16.

This (the discrepancy from 0) is usually fine... in a finite-precision world we should not be surprised.

Moreover, I would guess that numpy is being smart about symmetric products, to save flops and ensure symmetric output...

But I deal with chaotic systems, and this small discrepancy quickly becomes noticeable when debugging. So I'd like to know exactly what's going on.
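For completeness, here's a quick way to record which NumPy version and BLAS the numbers above came from (purely diagnostic; results like this can vary with both):

import numpy as np

print(np.__version__)   # results like this can differ between NumPy versions
np.show_config()        # shows which BLAS/LAPACK NumPy was built against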

asked Apr 17 '17 by Patrick
2 Answers

This behaviour is the result of a change introduced for NumPy 1.11.0, in pull request #6932. From the release notes for 1.11.0:

Previously, gemm BLAS operations were used for all matrix products. Now, if the matrix product is between a matrix and its transpose, it will use syrk BLAS operations for a performance boost. This optimization has been extended to @, numpy.dot, numpy.inner, and numpy.matmul.

In the changes for that PR, one finds this comment:

/*
 * Use syrk if we have a case of a matrix times its transpose.
 * Otherwise, use gemm for all other cases.
 */

So NumPy is making an explicit check for the case of a matrix times its transpose, and calling a different underlying BLAS function in that case. As @hpaulj notes in a comment, such a check is cheap for NumPy, since a transposed 2d array is simply a view on the original array, with inverted shape and strides, so it suffices to check a few pieces of metadata on the arrays (rather than having to compare the actual array data).
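You can see why that metadata check is cheap: the transpose is just a view on the same data buffer, with shape and strides swapped. A small illustration (the stride values shown assume float64 and C order):

import numpy as np

A = np.random.uniform(0, 1, (3, 4))
At = A.T

print(At.base is A)             # True: A.T is a view on A's data
print(np.shares_memory(A, At))  # True
print(A.shape, A.strides)       # (3, 4) (32, 8)
print(At.shape, At.strides)     # (4, 3) (8, 32)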

Here's a slightly simpler case that shows the discrepancy. Note that calling .copy() on one of the arguments to dot is enough to defeat NumPy's special-casing.

import numpy as np
random = np.random.RandomState(12345)
A = random.uniform(size=(10, 5))
Sym1 = A.dot(A.T)
Sym2 = A.dot(A.T.copy())
print(abs(Sym1 - Sym2).max())

I guess one advantage of this special-casing, beyond the obvious potential for speed-up, is that you're guaranteed (I'd hope, but in practice it'll depend on the BLAS implementation) to get a perfectly symmetric result when syrk is used, rather than a matrix which is merely symmetric up to numerical error. As an (admittedly not very good) test for this, I tried:

import numpy as np
random = np.random.RandomState(12345)
A = random.uniform(size=(100, 50))
Sym1 = A.dot(A.T)
Sym2 = A.dot(A.T.copy())
print("Sym1 symmetric: ", (Sym1 == Sym1.T).all())
print("Sym2 symmetric: ", (Sym2 == Sym2.T).all())

Results on my machine:

Sym1 symmetric:  True
Sym2 symmetric:  False
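As for the speed-up itself, a rough comparison (results will vary with your BLAS) is to time the product against a version where the transpose has been copied, which defeats the metadata check; the matrix size below is just illustrative:

import timeit
import numpy as np

random = np.random.RandomState(12345)
A = random.uniform(size=(2000, 2000))
At_copy = A.T.copy()

t_syrk = timeit.timeit(lambda: A @ A.T, number=5)      # eligible for syrk
t_gemm = timeit.timeit(lambda: A @ At_copy, number=5)  # forced through gemm
print(t_syrk, t_gemm)

Since syrk only needs to compute one triangle of the result, it can approach twice the speed of gemm, but the actual gain depends entirely on the BLAS implementation.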
answered Oct 20 '22 by Mark Dickinson

I suspect this has to do with promotion of intermediate floating-point registers to 80-bit precision. Somewhat confirming this hypothesis is that if we use fewer floats we consistently get 0 in the difference, e.g.:

import numpy as np

A  = np.random.uniform(0, 1, (4, 2))
w  = np.ones(2)
Aw = A * w
Sym1 = Aw.dot(Aw.T)
Sym2 = (A*w).dot((A*w).T)
diff = Sym1 - Sym2
# diff is all 0's (ymmv)
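One way to probe this (just a sketch; np.longdouble is 80-bit extended precision on typical x86 builds but may be an ordinary double on other platforms) is to recompute a single entry in extended precision and see how far each code path lands from it:

import numpy as np

A  = np.random.uniform(0, 1, (10, 5))
w  = np.ones(5)
Aw = A * w

# extended-precision reference for the (0, 1) entry
ref = (Aw[0].astype(np.longdouble) * Aw[1].astype(np.longdouble)).sum()

sym1 = Aw.dot(Aw.T)[0, 1]          # may take the syrk path
sym2 = Aw.dot(Aw.T.copy())[0, 1]   # forced onto the gemm path
print(abs(sym1 - ref), abs(sym2 - ref))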
answered Oct 20 '22 by U2EF1