I have a question: how to get a sub matrix like a sub array by boolean slicing?
For example:
a2 = np.array(np.arange(30).reshape(5, 6))
a2[a2[:, 1] > 10]
will give me:
array([[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29]])
but:
m2 = np.mat(np.arange(30).reshape(5, 6))
m2[m2[:, 1] > 10]
will give me:
matrix([[12, 18, 24]])
Why the output is different and How can I get the same result as array from matrix?
Thank you!
The issue you're experiencing comes down to the fact that operations on a matrix return always return a 2-dimensional array.
When you build the mask on the first array, you get:
In [24]: a2[:,1] > 10
Out[24]: array([False, False, True, True, True], dtype=bool)
which, as you can see, is a 1-dimensional array.
When you do the same thing with the matrix, you get:
In [25]: m2[:,1] > 10
Out[25]:
matrix([[False],
[False],
[ True],
[ True],
[ True]], dtype=bool)
In other words, you have a nx1 array, not an array of length n.
Indexing in numpy operates differently depending on whether you're indexing with a one or n dimensional array.
In your first case, numpy will treat the array of length n as row indices, so you'll get the expected result:
In [28]: a2[a2[:,1] > 10]
Out[28]:
array([[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29]])
In the second case, because you have a 2-dimensional index array, numpy has enough information to extract both the row and the column, and so it only grabs things from the matching column (the first one):
In [29]: m2[m2[:,1] > 10]
Out[29]: matrix([[12, 18, 24]])
To answer your question: you can get this behaviour by converting your masks to an array and grabbing the first column, to extract your initial array of length n:
In [32]: m2[np.array(m2[:,1] > 10)[:,0]]
Out[32]:
matrix([[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29]])
Alternatively, you could do the conversion first, getting the same result as before:
In [34]: np.array(m2)[:,1] > 10
Out[34]: array([False, False, True, True, True], dtype=bool)
Now, both of those fixes require conversions between matrices and arrays, which can be pretty ugly.
The question I'd be asking yourself is why you wish to use a matrix, and yet expect the behaviour of an array. It could be that the right tool for your job is actually an array, not a matrix.
If you flatten the boolean mask like:
m2[np.asarray(m2[:,1]>10).flatten()]
you get the same result, but I would recommend using np.array
instead of np.matrix
for the reasons given in this answer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With