
How to get a value from every column in a Numpy matrix

I'd like to get the index of a value for every column in a matrix M. For example:

import numpy

M = numpy.matrix([[0, 1, 0],
                  [4, 2, 4],
                  [3, 4, 1],
                  [1, 3, 2],
                  [2, 0, 3]])

In pseudocode, I'd like to do something like this:

for col in M:
    idx = numpy.where(M[col]==0) # Only for columns!

and have idx be 0, 4, 0 for each column.

I have tried to use where, but I don't understand the return value, which is a tuple of matrices.

jds asked Dec 15 '14 16:12


2 Answers

The tuple of matrices is a collection of items suited for indexing. The output will have the shape of the indexing matrices (or arrays), and each item in the output will be selected from the original array using the first array as the index of the first dimension, the second as the index of the second dimension, and so on. In other words, this:

>>> numpy.where(M == 0)
(matrix([[0, 0, 4]]), matrix([[0, 2, 1]]))
>>> row, col = numpy.where(M == 0)
>>> M[row, col]
matrix([[0, 0, 0]])
>>> M[numpy.where(M == 0)] = 1000
>>> M
matrix([[1000,    1, 1000],
        [   4,    2,    4],
        [   3,    4,    1],
        [   1,    3,    2],
        [   2, 1000,    3]])

The sequence may be what's confusing you. It proceeds in flattened order -- so M[0,2] appears second, not third. If you need to reorder them, you could do this:

>>> row[0,col.argsort()]
matrix([[0, 4, 0]])

You also might be better off using arrays instead of matrices. That way you can manipulate the shape of the arrays, which is often useful! Also note ajcr's transpose-based trick, which is probably preferable to using argsort.
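As a rough sketch of the array-based version, assuming M is built with numpy.array instead of matrix (the names rows, cols, order, and rows_by_col are illustrative only): where then returns a tuple of 1-D index arrays, so the argsort reordering becomes ordinary 1-D indexing.

```python
import numpy

# Same data as the question, but as a plain ndarray (an assumption;
# the question uses numpy.matrix).
M = numpy.array([[0, 1, 0],
                 [4, 2, 4],
                 [3, 4, 1],
                 [1, 3, 2],
                 [2, 0, 3]])

# For a 2-D ndarray, where() returns two 1-D arrays: row indices and
# column indices, listed in flattened (row-major) order.
rows, cols = numpy.where(M == 0)
print(rows)  # [0 0 4]
print(cols)  # [0 2 1]

# Reorder the row indices so they follow column order.
# A stable sort keeps ties in their original order.
order = numpy.argsort(cols, kind="stable")
rows_by_col = rows[order]
print(rows_by_col)  # [0 4 0]
```

This lines up one result per column only because each column here contains exactly one zero; with repeated or missing values, cols[order] tells you which column each entry belongs to.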

Finally, there is also a nonzero method that does the same thing as where in this case. Using the transpose trick now:

>>> (M == 0).T.nonzero()
(matrix([[0, 1, 2]]), matrix([[0, 4, 0]]))
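
The transpose trick can also be wrapped in a small helper. This is only a sketch: rows_of_value is a hypothetical name, not a NumPy function, and it converts its input to a plain ndarray so that the results come back as 1-D arrays.

```python
import numpy

def rows_of_value(a, value):
    """Return (cols, rows) for every entry equal to value, in column order.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # asarray() turns a matrix into a base ndarray, so nonzero()
    # yields 1-D index arrays instead of 1 x N matrices.
    mask = numpy.asarray(a) == value
    # Transposing first makes nonzero() walk the data column by column.
    cols, rows = numpy.nonzero(mask.T)
    return cols, rows

M = numpy.array([[0, 1, 0],
                 [4, 2, 4],
                 [3, 4, 1],
                 [1, 3, 2],
                 [2, 0, 3]])

cols, rows = rows_of_value(M, 0)
print(cols)  # [0 1 2]
print(rows)  # [0 4 0]
```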

senderle answered Oct 14 '22 02:10


As an alternative to np.where, you could perhaps use np.argwhere to return an array of indexes where the array meets the condition:

>>> np.argwhere(M == 0)
array([[[0, 0]],

       [[0, 2]],

       [[4, 1]]])

This tells you each of the indexes, in the format [row, column], where the condition was met.

If you'd prefer the output array to be grouped by column rather than by row (that is, in the format [column, row]), just use the method on the transpose of the array:

>>> np.argwhere(M.T == 0).squeeze()
array([[0, 0],
       [1, 4],
       [2, 0]])

I also used np.squeeze here to get rid of axis 1, so that we are left with a 2D array. The sequence you want is the second column, i.e. np.argwhere(M.T == 0).squeeze()[:, 1].
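
For comparison, here is a sketch of the same steps assuming M is a plain ndarray rather than a matrix (the names pairs and row_per_column are illustrative). In that case np.argwhere already returns a 2D array, so the squeeze step should not be needed.

```python
import numpy as np

# Same data as the question, stored as an ndarray (an assumption;
# the question uses a matrix).
M = np.array([[0, 1, 0],
              [4, 2, 4],
              [3, 4, 1],
              [1, 3, 2],
              [2, 0, 3]])

# Each row of the result is [column, row] because of the transpose.
pairs = np.argwhere(M.T == 0)
print(pairs.shape)  # (3, 2)

# The second column holds the row index of the match in each column of M.
row_per_column = pairs[:, 1]
print(row_per_column)  # [0 4 0]
```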

Alex Riley answered Oct 14 '22 02:10