I am trying to use spectral clustering on an image. I first calculate the affinity matrix, then attempt to get the eigenvectors. However, on a 7056x7056 matrix eig() call is taking too long. Any suggestions on how to improve this? Perhaps I should use a different form of affinity?
import matplotlib.pyplot as plt
import numpy as np
Img = plt.imread("twoObj.bmp")
Img2 = Img.flatten()
(n,) = Img2.shape
A = np.subtract.outer(Img2, Img2)
V,D = np.linalg.eig(A)
One quick and easy optimzation is to use np.linalg.eigh
. (And np.linalg.eigvalsh
if you just want the eigenvalues.)
Because you have a symmetric matrix (assuming you take the absolute value), you can "tell" numpy to use a more efficient algorithm this way.
import numpy as np
x = np.random.random(1000)
A = np.subtract.outer(x, x)
A = np.abs(A)
w, v = np.linalg.eigh(A)
Comparing timings, eigh
takes ~5.3 seconds while eig
takes ~23.4 seconds.
The performance of np.linalg.eig
, etc is going to be strongly dependent on which libraries numpy is linked to. Using a heavily optimized blas library (e.g. ATLAS or Intel's MKL) can have a very dramatic difference, especially in this case.
Also, depending on how numpy is built, (e.g. whether or not a fortran compiler was availabe) scipy.linalg.eigh
etc may be faster. There's also a chance that scipy and numpy may be linked against different blas libraries, though this is rather unlikely.
First of all, based on how you constructed your matrix A
. It will be an anti-symmetric (aka skew-symmetric) matrix, and it's rank will be (very likely) 2.
Perhaps you should take only the eigenvectors corresponding to the two largest eigenvalues. However it's likely that the eigenvalues are complex.
Anyway it might be that working with svd
(singular value decomposition) will actually be more straightforward one.
Please feel free to elaborate more on what you are aiming for.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With