Routine to extract linearly independent rows from a rank-deficient matrix

Tags: python, matrix

I'm struggling with the following problem: I have some very large matrices (at least 2000x2000, and in the future they will probably reach 10000x10000) with very small rank (2 or 3; call it N), and I need an efficient Python routine to extract the linearly independent rows (or columns, since the matrix is symmetric) from them. I tried taking the first N columns of the Q matrix of the QR decomposition, but it does not seem to work correctly (is this approach wrong, maybe?).

Here's the Python code I use to implement the method suggested by Ami Tavory:

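import numpy as np

# R is my matrix (symmetric, so rows and columns are interchangeable)
r_factor = np.linalg.qr(R)[1]  # index 1 is the upper-triangular factor, not Q
# The builtin sum() has no axis argument, so use the array method instead
sums = np.abs(r_factor).sum(axis=1)

for i in range(R.shape[0]):  # loop over the matrix dimension
    if sums[i] > 1.e-10:
        print("%d is a good index!" % i)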

This should tell me whether the i-th row of the triangular factor is non-zero and therefore whether the i-th column of R is linearly independent.
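One way to sanity-check any list of candidate indices, sketched here under the assumption that an SVD-based rank computation is affordable for the matrix at hand: the selected rows should have full rank, and that rank should equal the rank of the whole matrix. The helper name `check_row_selection` is hypothetical and not part of NumPy.

```python
import numpy as np

def check_row_selection(matrix, idx):
    """True if rows `idx` are independent and span the row space of `matrix`.

    Hypothetical helper for illustration; relies on numpy.linalg.matrix_rank.
    """
    matrix = np.asarray(matrix, dtype=float)
    idx = list(idx)
    if not idx:
        # An empty selection is only right for a rank-0 matrix.
        return np.linalg.matrix_rank(matrix) == 0
    sub_rank = np.linalg.matrix_rank(matrix[idx])
    return sub_rank == len(idx) and sub_rank == np.linalg.matrix_rank(matrix)

m = np.array([[1., 0., 0.], [2., 0., 0.], [0., 0., 1.]])  # row 1 = 2 * row 0
print(check_row_selection(m, [0, 2]))  # independent pair that spans the row space
print(check_row_selection(m, [0, 1]))  # dependent pair
```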

asked Jun 27 '15 20:06

Simone Bolognini

1 Answer

The Gram-Schmidt process finds a basis (equivalently, a largest independent subset) using linear combinations, and the QR decomposition effectively mimics this.
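To make the Gram-Schmidt connection concrete, here is a minimal sketch of a greedy, pivoted variant: repeatedly pick the row with the largest remaining norm and project its direction out of every row, stopping when only numerical noise is left. The function name `independent_rows` and the relative tolerance `tol` are illustrative assumptions, not part of NumPy.

```python
import numpy as np

def independent_rows(a, tol=1e-10):
    """Return indices of a maximal set of linearly independent rows of `a`.

    Hypothetical helper: greedy Gram-Schmidt with pivoting on the rows.
    `tol` is a relative threshold (an assumption; tune it for your data).
    """
    residual = np.array(a, dtype=float)            # working copy, updated in place
    scale = np.linalg.norm(residual, axis=1).max() # largest row norm sets the scale
    chosen = []
    if scale == 0.0:
        return chosen                              # zero matrix: nothing to pick
    for _ in range(min(residual.shape)):
        norms = np.linalg.norm(residual, axis=1)
        i = int(np.argmax(norms))
        if norms[i] <= tol * scale:
            break                                  # remaining rows are numerically zero
        chosen.append(i)
        q = residual[i] / norms[i]                 # unit vector along the picked row
        residual -= np.outer(residual @ q, q)      # remove that direction from all rows
    return chosen

# Rank-2 symmetric example built from two random vectors.
rng = np.random.default_rng(0)
u, v = rng.standard_normal(50), rng.standard_normal(50)
m = np.outer(u, u) + np.outer(v, v)
idx = independent_rows(m)
print(idx, np.linalg.matrix_rank(m[idx]))
```

For a matrix of rank N the loop runs about N times, and each pass touches the whole matrix once, which suits the very-low-rank case described in the question.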

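Therefore, one way to do what you want is to apply numpy.linalg.qr to the transpose and check which rows of the resulting R factor are non-zero (up to a numerical tolerance). The corresponding columns of the transposed matrix, i.e. the rows of your original matrix, are the candidates for an independent set. Note that numpy.linalg.qr does not pivot columns, so the position of an empty row in R is not guaranteed to match the index of the redundant column in every case.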

Edit: After some searching, I believe this Berkeley lecture explains it, but here are some examples:

>>> import numpy as np

# 2nd column is all zeros, hence redundant
>>> a = np.array([[1, 0, 0], [0, 0, 0], [1, 0, 1]])
>>> np.linalg.qr(a)[1]  # 2nd row of R is empty
array([[ 1.41421356,  0.        ,  0.70710678],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.70710678]])

# 2nd column is all zeros again, but now the 3rd row of a is zero
>>> a = np.array([[1, 0, 0], [1, 0, 1], [0, 0, 0]])
>>> np.linalg.qr(a)[1]  # 3rd row of R is empty, although the redundant column is the 2nd
array([[ 1.41421356,  0.        ,  0.70710678],
       [ 0.        ,  0.        , -0.70710678],
       [ 0.        ,  0.        ,  0.        ]])

# No column is redundant
>>> a = np.array([[1, 0, 0], [1, 0, 1], [2, 3, 4]])
>>> np.linalg.qr(a)[1]  # no row of R is empty
array([[ 2.44948974,  2.44948974,  3.67423461],
       [ 0.        ,  1.73205081,  1.73205081],
       [ 0.        ,  0.        ,  0.70710678]])
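Because numpy.linalg.qr does not reorder columns, a more robust variant of the same idea is QR with column pivoting, which SciPy exposes through scipy.linalg.qr with pivoting=True. The sketch below is an alternative to the plain QR used in the examples, not the method from the answer itself; the helper name `independent_rows_pivoted` and the relative tolerance `tol` are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import qr

def independent_rows_pivoted(a, tol=1e-10):
    """Indices of linearly independent rows of `a` via column-pivoted QR of a.T.

    Hypothetical helper; `tol` is a relative threshold chosen for illustration.
    """
    a = np.asarray(a, dtype=float)
    # Rows of `a` are columns of `a.T`. Pivoting reorders those columns so
    # that the magnitudes on the diagonal of R are non-increasing.
    r, piv = qr(a.T, mode='r', pivoting=True)
    diag = np.abs(np.diag(r))
    if diag.size == 0 or diag[0] == 0.0:
        return []                                   # empty or all-zero matrix
    rank = int(np.sum(diag > tol * diag[0]))        # numerical rank estimate
    return sorted(piv[:rank].tolist())              # first `rank` pivots are independent

a = np.array([[1, 0, 0], [1, 0, 1], [0, 0, 0]])    # 3rd row is all zeros
print(independent_rows_pivoted(a))                  # 0-based row indices
```

A full pivoted factorization scales cubically with the matrix size, so for 10000x10000 inputs of rank 2 or 3 a scheme that stops after finding N independent rows is likely to be cheaper.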

answered Nov 07 '22 00:11

Ami Tavory
