Numpy Routine for Computing Matrix Minors?

Tags: python, numpy

I'm interested in using numpy to compute all of the minors of a given square matrix. Is there a slick way of using array slicing to do this? I'm imagining that one can rotate the columns, delete the last column, rotate the rows of the resulting matrix and delete the last row, but I haven't found anything in the numpy documentation that indicates this is possible.
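
Here's a sketch of that rotate-and-drop idea using np.roll (my own illustration, with a caveat: the surviving rows and columns come out cyclically reordered, so this gives a permutation of the minor, whose determinant can differ from the true minor's by a sign):

import numpy as np

def minor_roll(A, i, j):
    # Move row i and column j to the last position, then drop them.
    # Note: the remaining rows/columns are cyclically reordered, so this
    # is a permutation of the minor, not the minor itself.
    B = np.roll(np.roll(A, -(i + 1), axis=0), -(j + 1), axis=1)
    return B[:-1, :-1]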

(Q: Why do this? A: I have a long sequence {M_n} of fairly large matrices, roughly 1,000,000 matrices of size 10,000 x 10,000, and I want to compute the determinant of each one. Each matrix is obtained from its predecessor by changing a single coefficient. It's going to be much faster to compute the determinant of the first matrix in the sequence and then compute each difference det(M_{n+1}) - det(M_n), which is the product of the change in that coefficient and its cofactor, i.e. its signed minor.)
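
A quick numerical check of that update identity (my own sketch, not part of the original question):

import numpy as np

# If only entry (i, j) changes, det changes by (new - old) * cofactor(i, j),
# where the cofactor is the signed minor: (-1)**(i + j) * det(minor).
rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
i, j = 2, 3

M_next = M.copy()
M_next[i, j] += 0.7  # change a single coefficient

minor_det = np.linalg.det(np.delete(np.delete(M, i, axis=0), j, axis=1))
cofactor = (-1) ** (i + j) * minor_det

lhs = np.linalg.det(M_next) - np.linalg.det(M)
rhs = (M_next[i, j] - M[i, j]) * cofactor
print(np.isclose(lhs, rhs))  # True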

asked Oct 04 '10 by user1504


2 Answers

I was thinking about this exact problem the other day, so I did a couple of implementations and performance tests and will share what I found.

Adding to the solutions provided by PaulDong and unutbu, I came up with two additional ones. One (minor_mask()) uses fancy mask-based indexing of NumPy arrays; the other (minor_fortran()) is a solution I came up with while playing around with good ol' Fortran and then slightly modified to compile with Numba. Putting all the solutions together and running some benchmarks:

The example code

import numpy as np
import numba


def minor_mask(A, i, j):
    """Own solution, based on NumPy fancy indexing"""
    # Build a boolean mask that is False on row i and column j, then
    # gather the surviving entries and restore the (n-1, n-1) shape.
    mask = np.ones_like(A, dtype=bool)
    mask[i, :] = False
    mask[:, j] = False
    return A[mask].reshape(A.shape[0] - 1, A.shape[1] - 1)


def minor_unutbu(A, i, j):
    """Solution provided by unutbu"""
    return A[
        np.array(list(range(i)) + list(range(i + 1, A.shape[0])))[:, np.newaxis],
        np.array(list(range(j)) + list(range(j + 1, A.shape[1]))),
    ]


def minor_pauldong(A, i, j):
    """Solution by PaulDong"""
    return np.delete(np.delete(A, i, axis=0), j, axis=1)


@numba.njit()
def minor_fortran(A, i, j):
    """
    Matrix minor calculation based on a Fortran routine.
    Compiles nicely with numba.
    """

    minor = np.zeros((A.shape[0] - 1, A.shape[1] - 1))

    for m in range(A.shape[0]):
        if m == i:
            continue  # skip the deleted row
        mshift = 1 if m > i else 0

        for n in range(A.shape[1]):
            if n == j:
                continue  # skip the deleted column
            nshift = 1 if n > j else 0

            # Skipping row i and column j explicitly avoids the
            # out-of-bounds write a shift-only version hits when i or j
            # is the last index.
            minor[m - mshift, n - nshift] = A[m, n]

    return minor
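
As a side note (my addition, not part of the benchmark): the index arrays in minor_unutbu can be built more compactly with np.r_, which concatenates slices into a single index array:

def minor_unutbu_r_(A, i, j):
    # Hypothetical compact variant of minor_unutbu; not timed below.
    rows = np.r_[0:i, i + 1:A.shape[0]]
    cols = np.r_[0:j, j + 1:A.shape[1]]
    return A[rows[:, np.newaxis], cols]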

Performance testing

On my machine (i5 9600K, 32 GB RAM, openSUSE Leap 15.2, Python 3.8.9, Numpy 1.20.3, Numba 0.53.1, ipykernel 5.5.5), I get the following results for small and large matrices using the following code:

m_small = np.arange(1e4).reshape(100, 100)
m_large = np.arange(1e8).reshape(10000, 10000)

# First run for setup of Numba and comparing results
r1 = minor_mask(m_large, 10, 11)
r2 = minor_unutbu(m_large, 10, 11)
r3 = minor_pauldong(m_large, 10, 11)
r4 = minor_fortran(m_large, 10, 11)

print(np.all(r1 == r2))
# --> True
print(np.all(r2 == r3))
# --> True
print(np.all(r3 == r4))
# --> True

# Large matrices
%timeit minor_mask(m_large, 10, 10)
# 136 ms ± 1.95 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit minor_unutbu(m_large, 10, 10)
# 247 ms ± 8.31 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit minor_pauldong(m_large, 10, 10)
# 217 ms ± 3.79 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit minor_fortran(m_large, 10, 10)
# 153 ms ± 1.26 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# Small matrices
%timeit minor_mask(m_small, 10, 10)
# 12.4 µs ± 22.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit minor_unutbu(m_small, 10, 10)
# 36.7 µs ± 140 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit minor_pauldong(m_small, 10, 10)
# 14.5 µs ± 102 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit minor_fortran(m_small, 10, 10)
# 10.4 µs ± 34.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Wrap-up

So, taken together, we find that the list-based approach by @unutbu performs worst in both test cases, followed by @PaulDong's approach (which is IMHO the cleanest of all the solutions).
The fancy-indexing approach performs consistently well for both small and large matrices, and is surpassed only by the compiled solution on small matrices. I expect the mask approach to become inefficient for very large matrices, since the boolean mask has to be held in memory to do the masking (for a 10,000 x 10,000 matrix the mask alone takes about 100 MB), but I did not do any memory profiling here.
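
A back-of-the-envelope check of that memory cost (my own sketch): NumPy booleans occupy one byte each, so the mask scales with the full matrix size.

import numpy as np

n = 10_000
mask_bytes = n * n * np.dtype(bool).itemsize  # 1 byte per bool
print(f"{mask_bytes / 1e6:.0f} MB")  # -> 100 MB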

I know the comparison is not on equal footing when a Numba-compiled solution is included, but I'd say it's nonetheless relevant for number crunching in Python.

answered Oct 23 '22 by FabianGD

The answer provided by unutbu is already great, and @ev-br's answer set me off on an interesting journey of optimizing the algorithm.

My answer below simply makes the intent more explicit.

import numpy as np

arr = np.random.normal(0, 1, (4, 4))


def matrix_minor(arr, i, j):
    return np.delete(np.delete(arr, i, axis=0), j, axis=1)

# tests
print(arr)
print(matrix_minor(arr, 0, 0))
print(matrix_minor(arr, 0, 1))
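
As a possible follow-up tying this back to the determinant-update use case in the question (my own hedged addition), the cofactor of entry (i, j) is the signed determinant of the corresponding minor:

i, j = 0, 1
cofactor = (-1) ** (i + j) * np.linalg.det(matrix_minor(arr, i, j))
print(cofactor)
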
answered Oct 23 '22 by PaulDong