
Matrix indexing in Numpy

I was growing confused during the development of a small Python script involving matrix operations, so I fired up a shell to play around with a toy example and develop a better understanding of matrix indexing in Numpy.

This is what I did:

>>> import numpy as np
>>> A = np.matrix([1,2,3])
>>> A
matrix([[1, 2, 3]])
>>> A[0]
matrix([[1, 2, 3]])
>>> A[0][0]
matrix([[1, 2, 3]])
>>> A[0][0][0]
matrix([[1, 2, 3]])
>>> A[0][0][0][0]
matrix([[1, 2, 3]])

As you can imagine, this has not helped me develop a better understanding of matrix indexing in Numpy. This behavior would make sense for something that I would describe as "An array of itself", but I doubt anyone in their right mind would choose that as a model for matrices in a scientific library.

What is, then, the logic to the output I obtained? Why would the first element of a matrix object be itself?

PS: I know how to obtain the first entry of the matrix. What I am interested in is the logic behind this design decision.

EDIT: I'm not asking how to access a matrix element, or why a matrix row behaves like a matrix. I'm asking for a definition of the behavior of a matrix when indexed with a single number. It's an action typical of arrays, but the resulting behavior is nothing like the one you would expect from an array. I would like to know how this is implemented and what's the logic behind the design decision.

cangrejo asked Dec 01 '15 17:12



2 Answers

Look at the shape after indexing:

In [295]: A=np.matrix([1,2,3])
In [296]: A.shape
Out[296]: (1, 3)
In [297]: A[0]
Out[297]: matrix([[1, 2, 3]])
In [298]: A[0].shape
Out[298]: (1, 3)

The key to this behavior is that np.matrix is always 2d. So even if you select one row (A[0,:]), the result is still 2d, shape (1,3). So you can string along as many [0] as you like, and nothing new happens.

What are you trying to accomplish with A[0][0]? The same as A[0,0]? For the base np.ndarray class these are equivalent.

Note that the Python interpreter translates indexing into __getitem__ calls:

 A.__getitem__(0).__getitem__(0)
 A.__getitem__((0,0))

[0][0] is 2 indexing operations, not one. So the effect of the second [0] depends on what the first produces.
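To make the two-step nature of chained indexing concrete, here is a small sketch that spells out the __getitem__ calls for both an ndarray and a matrix. The variable names are mine, chosen only for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3]])   # plain 2d ndarray, shape (1, 3)
mat = np.matrix([1, 2, 3])    # matrix, shape (1, 3)

# Chained indexing is two calls; the second acts on the result of the first.
arr_chained = arr.__getitem__(0).__getitem__(0)   # same as arr[0][0]
arr_tuple = arr.__getitem__((0, 0))               # same as arr[0, 0]

mat_chained = mat.__getitem__(0).__getitem__(0)   # same as mat[0][0]
mat_tuple = mat.__getitem__((0, 0))               # same as mat[0, 0]

print(arr_chained, arr_tuple)         # both are the scalar 1
print(mat_chained.shape, mat_tuple)   # (1, 3) and the scalar 1
```

For the ndarray the first call drops a dimension, so the second call reaches the element. For the matrix the first call returns another (1, 3) matrix, so the second call just selects its only row again.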

For an array, A[0,0] is equivalent to A[0,:][0]. But for a matrix, you need to do:

In [299]: A[0,:][:,0]
Out[299]: matrix([[1]])  # still 2d
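As a rough side-by-side, the sketch below contrasts that 1x1 result with two ways of reaching the scalar itself: a single tuple index, and the matrix's A1 attribute, which gives a flattened 1d ndarray. Variable names are illustrative only.

```python
import numpy as np

A = np.matrix([1, 2, 3])

one_by_one = A[0, :][:, 0]   # row select, then column select: still a matrix
scalar = A[0, 0]             # single tuple index: a NumPy scalar
flat = A.A1                  # the same data as a flattened 1d ndarray

print(one_by_one.shape)      # (1, 1)
print(scalar)                # 1
print(flat.shape, flat[0])   # (3,) 1
```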

=============================

"An array of itself", but I doubt anyone in their right mind would choose that as a model for matrices in a scientific library.

What is, then, the logic to the output I obtained? Why would the first element of a matrix object be itself?

In addition, A[0,:] is not the same as A[0]

In light of these comments let me suggest some clarifications.

A[0] does not mean 'return the 1st element'. It means 'select along the 1st axis'. For a 1d array that selects the 1st item; for a 2d array it selects the 1st row. For an ndarray that row is a 1d array, but for a matrix it is another matrix. So for a 2d array or matrix, A[i,:] is the same thing as A[i].
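A short sketch of those cases, using made-up example data with two rows so that the row selection is visible:

```python
import numpy as np

arr1d = np.array([1, 2, 3])
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
mat = np.matrix([[1, 2, 3], [4, 5, 6]])

first_item = arr1d[0]   # 1d array: the first item, a scalar
arr_row = arr2d[0]      # 2d array: the first row, as a 1d array
mat_row = mat[0]        # matrix: the first row, as a 1x3 matrix

print(first_item)       # 1
print(arr_row.shape)    # (3,)
print(mat_row.shape)    # (1, 3)

# A[i] and A[i, :] select the same data for both classes.
same_for_array = np.array_equal(arr2d[1], arr2d[1, :])
same_for_matrix = np.array_equal(mat[1], mat[1, :])
print(same_for_array, same_for_matrix)   # True True
```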

A[0] does not just return itself. It returns a new matrix. Different id:

In [303]: id(A)
Out[303]: 2994367932
In [304]: id(A[0])
Out[304]: 2994532108

It may have the same data, shape and strides, but it's a new object. It's just as unique as the ith row of a many-row matrix.
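Rather than comparing id values by eye, the same point can be checked with an identity test and np.shares_memory. This is only a sketch; the write through row at the end is my own addition to show that the new object is a view onto the same data.

```python
import numpy as np

A = np.matrix([1, 2, 3])
row = A[0]

is_same_object = row is A              # a new Python object each time
shares = np.shares_memory(A, row)      # but backed by the same buffer

print(is_same_object)   # False
print(shares)           # True

# Writing through the row changes the original matrix as well.
row[0, 1] = 20
print(A[0, 1])          # 20
```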

Most of the unique matrix activity is defined in numpy/matrixlib/defmatrix.py. I was going to suggest looking at the matrix.__getitem__ method, but most of the action is performed in np.ndarray.__getitem__.
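To give a feel for how a subclass can delegate to np.ndarray.__getitem__ and then force the result back to 2d, here is a toy sketch. Always2D is a hypothetical class I made up for illustration; it is not the code in defmatrix.py and it ignores many cases the real class handles.

```python
import numpy as np

class Always2D(np.ndarray):
    """Hypothetical toy class mimicking the keep-it-2d idea; not NumPy's code."""

    def __getitem__(self, index):
        # Let the base class do the actual indexing work.
        out = np.ndarray.__getitem__(self, index)
        if not isinstance(out, np.ndarray):
            return out   # a scalar, e.g. from A[0, 0]
        if out.ndim == 1:
            n = out.shape[0]
            # An integer in the column position means a column was selected.
            picks_column = (isinstance(index, tuple) and len(index) > 1
                            and isinstance(index[1], (int, np.integer)))
            out = out.reshape((n, 1) if picks_column else (1, n))
        return out

A = np.array([[1, 2, 3], [4, 5, 6]]).view(Always2D)

print(A[0].shape)         # (1, 3)
print(A[0][0][0].shape)   # (1, 3)
print(A[:, 1].shape)      # (2, 1)
print(A[0, 0])            # 1
```

With this toy class, chaining [0] keeps returning a one-row 2d object, which is the behavior seen in the question.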

The np.matrix class was added to numpy as a convenience for old-school MATLAB programmers. numpy arrays can have almost any number of dimensions (0, 1, 2, and so on). MATLAB originally allowed only 2, though a release around 2000 generalized it to 2 or more.

hpaulj answered Oct 13 '22 15:10


Imagine you have the following:

>>> A = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

If you want to get the second column, use the following:

>>> A.T[1]
array([ 2,  6, 10])
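For comparison, a column slice on the original array should give the same values without going through the transpose. A small sketch, reusing the array above:

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

col_via_transpose = A.T[1]   # row 1 of the transpose
col_via_slice = A[:, 1]      # all rows, column 1

print(col_via_slice)         # [ 2  6 10]
print(np.array_equal(col_via_transpose, col_via_slice))   # True
```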
Mona Jalal answered Oct 13 '22 13:10