Selecting a column of a numpy array

I am somewhat confused about selecting a column of a NumPy array, because the result is different from MATLAB and even from a NumPy matrix. Please see the following cases.

In MATLAB, we use the following command to select a column vector out of a matrix:

x = [0, 1; 2 3]
out = x(:, 1)

Then out becomes [0; 2], which is a column vector.

To do the same thing with a NumPy matrix:

x = np.matrix([[0, 1], [2, 3]])
out = x[:, 0]

Then the output is np.matrix([[0], [2]]) which is expected, and it is a column vector.

However, in the case of a NumPy array:

x = np.array([[0, 1], [2, 3]])
out = x[:, 0]

The output is np.array([0, 2]) which is 1 dimensional, so it is not a column vector. My expectation is it should have been np.array([[0], [2]]). I have two questions.

1. Why is the output from the NumPy array case different from the NumPy matrix case? This causes me a lot of confusion, but I think there might be some reason for it.

2. To get a column vector from a 2-D NumPy array, should I do something extra, such as calling expand_dims?

x = np.array([[0, 1], [2, 3]])
out = np.expand_dims(x[:, 0], axis=1)
chanwcom asked Dec 15 '22 07:12


1 Answer

In MATLAB everything has at least 2 dimensions. In older versions of MATLAB, 2d was the only option; now arrays can have more. np.matrix is modeled on that old MATLAB: it is always 2d, so indexing it returns another 2d matrix.

What does MATLAB do when you index a 3d matrix?

np.array is more general. It can have 0, 1, 2 or more dimensions.
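As a small illustration of that range of dimensions (the variable names here are made up for the example), ndim and shape report how many axes each array has:

```python
import numpy as np

# Illustrative arrays only; the names are not from the question.
scalar = np.array(5)                 # 0 dimensions
vector = np.array([0, 2])            # 1 dimension
table = np.array([[0, 1], [2, 3]])   # 2 dimensions
cube = np.ones((2, 3, 4))            # 3 dimensions

for a in (scalar, vector, table, cube):
    print(a.ndim, a.shape)
```

There is no MATLAB equivalent of the first two: a MATLAB scalar is 1x1 and a vector is 1xN or Nx1.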

x[:, 0]
x[0, :]

select one column and one row respectively, and each returns an array with one fewer dimension.

x[:, [0]]
x[[0], :]

would return 2d arrays, with a singleton dimension.
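A short sketch of the options for the array in the question may help. The variable names are only for illustration, and this is one way to lay out the alternatives rather than a recommendation of any single one:

```python
import numpy as np

x = np.array([[0, 1], [2, 3]])

flat = x[:, 0]                        # scalar index: that axis is dropped
col_list = x[:, [0]]                  # list index: axis kept (this makes a copy)
col_slice = x[:, 0:1]                 # length-1 slice: axis kept (this is a view)
col_newaxis = x[:, 0][:, np.newaxis]  # re-add the axis after dropping it
col_expand = np.expand_dims(x[:, 0], axis=1)  # the approach from the question

print(flat.shape)       # (2,)
print(col_list.shape)   # (2, 1)
print(col_slice.shape)  # (2, 1)
```

All four 2d results hold the same values, [[0], [2]], so expand_dims works but is not required.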

In Octave (a MATLAB clone), indexing a 3d array gives inconsistent results, depending on which end of the array I index:

octave:7> x=ones(2,3,4);
octave:8> size(x)
ans =
   2   3   4

octave:9> size(x(1,:,:))
ans =
   1   3   4

octave:10> size(x(:,:,1))
ans =    
   2   3

MATLAB/Octave adds dimensions at the end, and apparently readily squeezes them down on that side as well.

numpy orders the dimensions in the other direction, and can add dimensions at the start as needed (for example, when broadcasting). But it is consistent when indexing: a scalar index removes that dimension, whichever axis it is on.
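For comparison with the Octave session, here is a rough numpy counterpart using the same 2x3x4 shape. The names row and total are hypothetical, chosen just for this sketch:

```python
import numpy as np

x = np.ones((2, 3, 4))

# A scalar index drops its axis no matter where it sits.
print(x[0, :, :].shape)   # (3, 4)
print(x[:, 0, :].shape)   # (2, 4)
print(x[:, :, 0].shape)   # (2, 3)

# New dimensions are added at the start, not the end.
row = np.arange(4)                 # shape (4,)
total = x + row                    # row is broadcast as if it were (1, 1, 4)
print(total.shape)                 # (2, 3, 4)
print(np.atleast_2d(row).shape)    # (1, 4)
```

Unlike Octave's size(x(1,:,:)), which keeps the leading 1, the first result here is already 2d.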

The fact that numpy arrays can have any number of dimensions, while MATLAB has a minimum of 2, is a crucial difference that often trips up MATLAB users. But one isn't any more logical than the other. MATLAB's practice is more a matter of history than of general principles.

hpaulj answered Jan 03 '23 06:01