I want to solve an eigenvalue problem using TensorFlow. In particular, I have
e, v = tf.self_adjoint_eig(laplacian, name="eigendata")
eigenmap = v[:,1:4]
so I only need a few eigenvectors and don't want to compute all of them.
In MATLAB, I would use eigs(laplacian,4,'sm').
Looking at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/linalg_ops.py,
I see that tf.self_adjoint_eig calls gen_linalg_ops._self_adjoint_eig_v2.
However, I can't find gen_linalg_ops on GitHub or elsewhere.
Any advice on doing such linear algebra in TensorFlow, or is it best to go with other libraries in Python?
The MATLAB function EIG calculates all of the eigenvectors. The MATLAB function EIGS calculates only a selected number of eigenvectors, using the precompiled ARPACK library (https://en.wikipedia.org/wiki/ARPACK), which implements the Lanczos algorithm (https://en.wikipedia.org/wiki/Lanczos_algorithm). There is no native Lanczos code in MATLAB, most likely because the Lanczos algorithm is unavoidably unstable with respect to round-off errors, especially in single precision, which makes more stable implementations tricky and/or expensive.
An alternative to the EIGS function is lobpcg.m (https://www.mathworks.com/matlabcentral/fileexchange/48-lobpcg-m), which implements LOBPCG (https://en.wikipedia.org/wiki/LOBPCG) natively in MATLAB.
SciPy has an interface to ARPACK as well as a native Python implementation of LOBPCG: https://docs.scipy.org/doc/scipy-1.1.0/reference/generated/scipy.sparse.linalg.lobpcg.html
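For the use case in the question (a few smallest eigenpairs of a Laplacian, i.e. the equivalent of eigs(laplacian,4,'sm')), a minimal SciPy sketch might look as follows; the path-graph Laplacian here is just a stand-in for your actual matrix:

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh, lobpcg

n = 100
# Stand-in matrix: discrete 1-D Laplacian (replace with your own)
laplacian = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")

# ARPACK, smallest-magnitude eigenpairs, like eigs(laplacian,4,'sm')
vals, vecs = eigsh(laplacian, k=4, which="SM")
eigenmap = vecs[:, 1:4]  # same slice as v[:,1:4] in the question

# LOBPCG with a random initial block; largest=False asks for the smallest
X = np.random.rand(n, 4)
vals2, vecs2 = lobpcg(laplacian, X, largest=False, maxiter=200)

The shift-invert mode of eigsh (passing sigma instead of which="SM") is closer to what MATLAB's 'sm' option actually does and converges much faster, but it requires factorizing the shifted matrix.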
scikit-learn uses ARPACK or LOBPCG for spectral embedding (http://scikit-learn.org/stable/modules/generated/sklearn.manifold.spectral_embedding.html) and for spectral clustering (http://scikit-learn.org/stable/modules/generated/sklearn.cluster.SpectralClustering.html).
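So if you are starting from an affinity graph rather than a prebuilt Laplacian, a short sketch (the random affinity matrix here is made up for illustration):

import numpy as np
from sklearn.manifold import spectral_embedding

# Symmetric non-negative affinity matrix, a stand-in for real graph weights
rng = np.random.RandomState(0)
A = rng.rand(100, 100)
affinity = 0.5 * (A + A.T)

# First 3 embedding coordinates from the bottom Laplacian eigenvectors;
# eigen_solver may be "arpack" or "lobpcg" (or "amg" if pyamg is installed)
embedding = spectral_embedding(affinity, n_components=3,
                               eigen_solver="arpack", random_state=0)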
TensorFlow now has a native implementation of the Lanczos algorithm: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/solvers/python/ops/lanczos.py
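For completeness, a sketch of how that contrib module could be driven; it never made it into the stable API, so take lanczos_bidiag and util.create_operator, and their signatures, as assumptions read off the linked source:

import tensorflow as tf
from tensorflow.contrib.solvers.python.ops import lanczos, util

# Assumed contrib API: wrap a dense matrix as the solver's linear operator
matrix = tf.random_normal([100, 100], seed=0)
operator = util.create_operator(matrix)

# k steps of Lanczos bidiagonalization (assumed signature); the result holds
# the Lanczos bases and bidiagonal coefficients, from which approximate
# singular/eigen pairs can be recovered
decomp = lanczos.lanczos_bidiag(operator, k=10, orthogonalize=True)

with tf.Session() as sess:
    result = sess.run(decomp)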
Current TensorFlow linalg implementations are single-core and seem to be written from scratch, so it may take some time for them to match the functionality of mature libraries like Intel's Math Kernel Library (MKL).
You could do the computation in MKL (available with the conda build of SciPy) and pass the values between MKL and TensorFlow as NumPy arrays. Since the computation scales as O(n^3) for large enough matrices, the extra cost of the data transfer is negligible compared to the computation itself.
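A minimal sketch of that round trip with the TF 1.x session API (the shapes and names here are made up): pull the tensor out as a NumPy array, factorize it with MKL-backed SciPy, and feed the result back through a placeholder.

import numpy as np
import tensorflow as tf
from scipy import linalg

n = 100
target = tf.Variable(tf.random_normal([n, n], seed=0))
s_var = tf.Variable(tf.zeros([n]))                  # cached singular values
s_holder = tf.placeholder(tf.float32, shape=[n])
update_s = s_var.assign(s_holder)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    target0 = sess.run(target)                      # TF -> NumPy
    u0, s0, vt0 = linalg.svd(target0)               # runs in LAPACK/MKL
    sess.run(update_s, feed_dict={s_holder: s0})    # NumPy -> TF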
For instance, to compute a full SVD in MKL (which, counter-intuitively, is faster than a self-adjoint eig), I use the following wrapper. It keeps the results in tf.Variable objects and lets me switch between the MKL and TensorFlow implementations:
import numpy as np
import tensorflow as tf
from collections import namedtuple
from scipy import linalg

default_dtype = tf.float32
USE_MKL_SVD = True  # TensorFlow vs MKL SVD
if USE_MKL_SVD:
    assert np.__config__.get_info("lapack_mkl_info"), "No MKL detected :("

SvdTuple = namedtuple("SvdTuple", ["s", "u", "v"])

def ones(n, name=None):
    """Vector of n ones, used as the initial singular values."""
    return tf.ones((int(n),), dtype=default_dtype, name=name)

def Identity(n, name=None):
    """n x n identity matrix, used as the initial singular vectors."""
    return tf.eye(int(n), dtype=default_dtype, name=name)

class SvdWrapper:
    """Encapsulates variables needed to perform SVD of a TensorFlow target.

    Initialize: wrapper = SvdWrapper(tensorflow_var)
    Trigger SVD: wrapper.update_tf() or wrapper.update_scipy()
    Access result as TF vars: wrapper.s, wrapper.u, wrapper.v
    """

    def __init__(self, target, name="svd"):
        self.name = name
        self.target = target
        # tf.svd returns (s, u, v); unpack into the named tuple
        self.tf_svd = SvdTuple(*tf.svd(target))
        self.init = SvdTuple(
            ones(target.shape[0], name=name + "_s_init"),
            Identity(target.shape[0], name=name + "_u_init"),
            Identity(target.shape[0], name=name + "_v_init")
        )
        assert self.tf_svd.s.shape == self.init.s.shape
        assert self.tf_svd.u.shape == self.init.u.shape
        assert self.tf_svd.v.shape == self.init.v.shape
        self.cached = SvdTuple(
            tf.Variable(self.init.s, name=name + "_s"),
            tf.Variable(self.init.u, name=name + "_u"),
            tf.Variable(self.init.v, name=name + "_v")
        )
        self.s = self.cached.s
        self.u = self.cached.u
        self.v = self.cached.v
        self.holder = SvdTuple(
            tf.placeholder(default_dtype, shape=self.cached.s.shape, name=name + "_s_holder"),
            tf.placeholder(default_dtype, shape=self.cached.u.shape, name=name + "_u_holder"),
            tf.placeholder(default_dtype, shape=self.cached.v.shape, name=name + "_v_holder")
        )
        self.update_tf_op = tf.group(
            self.cached.s.assign(self.tf_svd.s),
            self.cached.u.assign(self.tf_svd.u),
            self.cached.v.assign(self.tf_svd.v)
        )
        self.update_external_op = tf.group(
            self.cached.s.assign(self.holder.s),
            self.cached.u.assign(self.holder.u),
            self.cached.v.assign(self.holder.v)
        )
        self.init_ops = (self.s.initializer, self.u.initializer, self.v.initializer)

    def update(self):
        if USE_MKL_SVD:
            self.update_scipy()
        else:
            self.update_tf()

    def update_tf(self):
        sess = tf.get_default_session()
        sess.run(self.update_tf_op)

    def update_scipy(self):
        sess = tf.get_default_session()
        target0 = self.target.eval()
        # A = u @ diag(s) @ v', singular vectors are columns
        # TODO: catch "ValueError: array must not contain infs or NaNs"
        u0, s0, vt0 = linalg.svd(target0)
        v0 = vt0.T
        # v0 = vt0  # bug, makes loss increase, use for sanity checks
        feed_dict = {self.holder.u: u0,
                     self.holder.v: v0,
                     self.holder.s: s0}
        sess.run(self.update_external_op, feed_dict=feed_dict)
And to use it:

covariance = data @ tf.transpose(data)
svd = SvdWrapper(target=covariance)
sess.run(svd.init_ops)           # initialize factors to identity matrices
svd.update()                     # update using the latest value of covariance
sess.run([svd.s, svd.u, svd.v])  # get values of the factors