 

Storing a large but low-rank matrix efficiently

Tags:

r

matrix

rank

cran

I know there are packages in R to store sparse matrices efficiently. Is there also a way to store a low-rank matrix efficiently? For example:

A <- matrix(rnorm(1e6), nrow=1e5, ncol=1e1)
B <- A %*% t(A)

Now, B is too large to store in memory, but it has low rank. Is there any way to construct and store B efficiently, such that basic read methods (rowSums, colSums, etc.) are computed on the fly, trading CPU time for memory?

asked Mar 14 '13 by Jeroen Ooms

People also ask

What is low rank optimization?

In mathematics, low-rank approximation is a minimization problem, in which the cost function measures the fit between a given matrix (the data) and an approximating matrix (the optimization variable), subject to a constraint that the approximating matrix has reduced rank.
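For example, the best rank-k approximation in the least-squares sense can be obtained by truncating a singular value decomposition. A minimal R sketch (the matrix M and the rank k here are purely illustrative):

# Best rank-k approximation of M via a truncated SVD
set.seed(1)
M <- matrix(rnorm(200 * 50), nrow=200, ncol=50)
k <- 5
s <- svd(M)
Mk <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])
norm(M - Mk, type="F")   # approximation error in the Frobenius norm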

What is meant by low rank matrix?

Upshot: A matrix is low-rank if it has many fewer linearly independent columns than columns. Such matrices can be efficiently represented using rank-factorizations, which can be used to perform various computations rapidly.
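As a rough illustration of the saving (a sketch with made-up dimensions), a 1e4 x 1e4 rank-10 matrix stored as two factors takes well under 2 MB, while the dense product would take roughly 800 MB:

# Store the rank factorization instead of the dense product
U <- matrix(rnorm(1e4 * 10), nrow=1e4)   # left factor:  1e4 x 10
V <- matrix(rnorm(10 * 1e4), ncol=1e4)   # right factor: 10 x 1e4
print(object.size(U), units="MB")        # ~0.8 MB
print(object.size(V), units="MB")        # ~0.8 MB
# U %*% V would hold 1e8 doubles, i.e. roughly 800 MB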

Why do we need low rank approximation?

Low-rank approximation is thus a way to recover the "original" low-rank matrix (the "ideal" matrix before it was corrupted by noise, etc.): find the matrix that is most consistent with the observed entries of the given matrix while being low-rank, so that it can serve as an approximation to the ideal matrix.

Does matrix factorization generate a low-rank approximation of a matrix?

The algorithm for computing the QR factorization can trivially be modified to compute a low rank approximation to a matrix. Consider the algorithm given in Figure 2. After k steps of the algorithm, we find that the matrices Q and R hold the following entries: R = [ Rk 0 ] and Q = [ Qk Ãk ].
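A hedged sketch of that idea in base R, using the column-pivoted QR from LAPACK (the matrix M and the rank k are illustrative):

# Rank-k approximation from a column-pivoted QR factorization
set.seed(1)
M <- matrix(rnorm(200 * 50), nrow=200, ncol=50)
k <- 5
qrM <- qr(M, LAPACK=TRUE)   # QR with column pivoting
Q <- qr.Q(qrM)
R <- qr.R(qrM)
# M[, qrM$pivot] = Q %*% R, so undo the pivoting after truncating to rank k
Mk <- (Q[, 1:k] %*% R[1:k, ])[, order(qrM$pivot)]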


1 Answer

Your question already contains the answer: to store such a low-rank matrix efficiently, build a data structure that holds both factors. Whenever a matrix-vector product with the full matrix is required, evaluate it from right to left using matrix-vector products with the factors.

One application of this strategy and data structure can be found in implementations of limited-memory Broyden or BFGS quasi-Newton methods.
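A minimal sketch of such a structure in R, for the B = A %*% t(A) from the question (the helper names lr_multiply and lr_rowSums are only illustrative; only the factor A is ever stored):

# Keep only the factor A; B = A %*% t(A) is never formed
A <- matrix(rnorm(1e6), nrow=1e5, ncol=1e1)

# B %*% x  =  A %*% (t(A) %*% x), evaluated right to left
lr_multiply <- function(A, x) A %*% (t(A) %*% x)

# rowSums(B) = A %*% colSums(A); B is symmetric, so colSums(B) is identical
lr_rowSums <- function(A) drop(A %*% colSums(A))
lr_colSums <- lr_rowSums

x <- rnorm(1e5)
bx <- lr_multiply(A, x)   # length-1e5 result, computed without forming B
rs <- lr_rowSums(A)

Wrapping the factor and these methods in an S4 class would let the object behave like an ordinary matrix for the supported operations.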

answered Sep 23 '22 by Lutz Lehmann