The naive algorithm for multiplying 4x4 matrices looks like this:
void matrix_mul(double out[4][4], double lhs[4][4], double rhs[4][4]) {
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            out[i][j] = 0.0;
            for (int k = 0; k < 4; ++k) {
                out[i][j] += lhs[i][k] * rhs[k][j];
            }
        }
    }
}
Obviously, this algorithm gives bogus results if out == lhs or out == rhs (here == means reference equality). Is there a version that allows one or both of those cases without simply copying the matrix? I'm happy to have different functions for each case if necessary.
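For example, squaring in place already corrupts the very first entry, because out[i][j] = 0.0 zeroes a value the inner loop still needs. A minimal demo (the test matrix is arbitrary, and this assumes matrix_mul from above is in the same file):

#include <stdio.h>

void matrix_mul(double out[4][4], double lhs[4][4], double rhs[4][4]); /* defined above */

int main(void) {
    double m[4][4] = {
        {1, 1, 0, 0},
        {0, 1, 0, 0},
        {0, 0, 1, 0},
        {0, 0, 0, 1},
    };
    matrix_mul(m, m, m);      /* out == lhs == rhs */
    printf("%f\n", m[0][0]);  /* prints 0.000000; the correct square has 1.0 here */
    return 0;
}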
I found this paper, but it discusses the Strassen-Winograd algorithm, which is overkill for my small matrices. The answers to this question seem to indicate that if out == lhs && out == rhs (i.e., we're attempting to square the matrix), then it can't be done in place, but even there no convincing evidence or proof is given.
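One case is easy to dispatch from the loop structure alone: out[i][*] reads only row i of lhs, so if out == lhs (and out != rhs), a single 4-element row buffer suffices, with no full-matrix copy. A sketch, with a hypothetical name:

/* Handles out == lhs (but NOT out == rhs) using O(n) extra words:
   save row i of lhs before the output writes clobber it. */
void matrix_mul_alias_lhs(double out[4][4], double lhs[4][4], double rhs[4][4]) {
    for (int i = 0; i < 4; ++i) {
        double row[4];
        for (int k = 0; k < 4; ++k)
            row[k] = lhs[i][k];            /* buffer row i before overwriting it */
        for (int j = 0; j < 4; ++j) {
            out[i][j] = 0.0;
            for (int k = 0; k < 4; ++k)
                out[i][j] += row[k] * rhs[k][j];
        }
    }
}

The symmetric trick (buffering one column of rhs and sweeping column by column) handles out == rhs alone; neither helps when out aliases both operands.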
I'm not thrilled with this answer (I'm posting it mainly to silence the "it obviously can't be done" crowd), but I'm skeptical that it's possible to do much better with a true in-place algorithm (O(1) extra words of storage for multiplying two n x n matrices). Let's call the two matrices to be multiplied A and B. Assume that A and B are not aliased.
If A were upper-triangular, then the multiplication problem would look like this.
[a11 a12 a13 a14] [b11 b12 b13 b14]
[ 0 a22 a23 a24] [b21 b22 b23 b24]
[ 0 0 a33 a34] [b31 b32 b33 b34]
[ 0 0 0 a44] [b41 b42 b43 b44]
We can compute the product into B as follows. Multiply the first row of B by a11. Add a12 times the second row of B to the first. Add a13 times the third row of B to the first. Add a14 times the fourth row of B to the first.
Now we've overwritten the first row of B with the correct product. Fortunately, we don't need it any more. Multiply the second row of B by a22. Add a23 times the third row of B to the second. (You get the idea.)
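In code, that top-to-bottom sweep might look like this (a sketch with a hypothetical name; the key point is that row i of the product depends only on rows i through 4 of B, which are still intact when row i is finalized):

/* B <- U*B in place, reading only the upper triangle of U. */
void mul_upper_inplace(double U[4][4], double B[4][4]) {
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            B[i][j] *= U[i][i];               /* diagonal term scales row i */
            for (int k = i + 1; k < 4; ++k)
                B[i][j] += U[i][k] * B[k][j]; /* rows below i are still original */
        }
    }
}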
Likewise, if A were unit lower-triangular, then the multiplication problem would look like this.
[ 1 0 0 0 ] [b11 b12 b13 b14]
[a21 1 0 0 ] [b21 b22 b23 b24]
[a31 a32 1 0 ] [b31 b32 b33 b34]
[a41 a42 a43 1 ] [b41 b42 b43 b44]
Add a43 times the third row of B to the fourth. Add a42 times the second row of B to the fourth. Add a41 times the first row of B to the fourth. Add a32 times the second row of B to the third. (You get the idea.)
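The same idea in code, sweeping bottom to top this time (again a sketch; only the strict lower triangle of L is read, and the unit diagonal means row i contributes itself unscaled):

/* B <- L*B in place for unit lower-triangular L. Rows are finalized
   bottom to top, so the earlier rows of B they need are still intact. */
void mul_unit_lower_inplace(double L[4][4], double B[4][4]) {
    for (int i = 3; i >= 0; --i) {
        for (int j = 0; j < 4; ++j) {
            /* unit diagonal: B[i][j] already carries its own term */
            for (int k = 0; k < i; ++k)
                B[i][j] += L[i][k] * B[k][j];
        }
    }
}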
The complete algorithm is to LU-decompose A in place, multiply UB into B, multiply LB into B, and then LU-undecompose A in place (I'm not sure if anyone ever does this, but it seems easy enough to reverse the steps). There are about a million reasons not to implement this in practice, two being that A may not be LU-decomposable (without pivoting) and that A won't be reconstructed exactly in general with floating-point arithmetic.
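Putting the pieces together, a hypothetical end-to-end sketch (assuming A admits an LU factorization without pivoting, and reusing the two triangular helpers above; L and U share A's storage, with L's unit diagonal left implicit):

/* In-place Doolittle LU: afterwards the strict lower triangle of A holds
   the multipliers of L and the upper triangle holds U. */
void lu_decompose_inplace(double A[4][4]) {
    for (int k = 0; k < 4; ++k) {
        for (int i = k + 1; i < 4; ++i) {
            A[i][k] /= A[k][k];               /* multiplier l_ik */
            for (int j = k + 1; j < 4; ++j)
                A[i][j] -= A[i][k] * A[k][j]; /* eliminate below the pivot */
        }
    }
}

/* Reverse of the above: reconstruct A = L*U in place (up to round-off). */
void lu_undecompose_inplace(double A[4][4]) {
    for (int k = 3; k >= 0; --k) {
        for (int i = k + 1; i < 4; ++i) {
            for (int j = k + 1; j < 4; ++j)
                A[i][j] += A[i][k] * A[k][j]; /* undo the elimination step */
            A[i][k] *= A[k][k];               /* undo the division by the pivot */
        }
    }
}

/* B <- A*B with O(1) extra words, via the two triangular steps. */
void matrix_mul_inplace(double A[4][4], double B[4][4]) {
    lu_decompose_inplace(A);
    mul_upper_inplace(A, B);      /* B <- U*B (reads only A's upper triangle) */
    mul_unit_lower_inplace(A, B); /* B <- L*(U*B) = A*B */
    lu_undecompose_inplace(A);    /* restore A, modulo floating-point error */
}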