In both MATLAB and NumPy, arrays can be indexed by arrays, but the behavior differs. Let me explain with an example.
MATLAB:
>> A = rand(5,5)
A =
0.1622 0.6020 0.4505 0.8258 0.1067
0.7943 0.2630 0.0838 0.5383 0.9619
0.3112 0.6541 0.2290 0.9961 0.0046
0.5285 0.6892 0.9133 0.0782 0.7749
0.1656 0.7482 0.1524 0.4427 0.8173
>> A([1,3,5],[1,3,5])
ans =
0.1622 0.4505 0.1067
0.3112 0.2290 0.0046
0.1656 0.1524 0.8173
NumPy:
In [1]: from numpy import arange
In [2]: A = arange(25).reshape((5,5))
In [3]: A
Out[3]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
In [6]: A[[0,2,4], [0,2,4]]
Out[6]: array([ 0, 12, 24])
In words: MATLAB selects rows and columns, whereas NumPy "zips" the two index arrays and uses the resulting tuples to point to individual entries.
How can I get the MATLAB behavior with NumPy?
Boolean indexing returns a copy of the data, not a view of the original data like the one you get from a slice, so the result can be modified while the original is preserved. However, assignments made through an indexed array (for example, A[mask] = value) are always made to the original data.
MATLAB takes 0.252454 seconds to complete the task, while NumPy takes 0.973672151566 seconds, almost four times as long.
In NumPy, an array can itself be used as an index. A slice returns a view of the array, whereas an index array returns a copy of the selected data. NumPy arrays can be indexed with other arrays or with any other sequence that can be converted to an array, such as a list. Tuples are the exception: a tuple is interpreted as one index per axis rather than as an index array.
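A minimal sketch of the view-versus-copy distinction and of the tuple exception, assuming NumPy is imported as np. The names data, view, copy and grid are illustrative only.

```python
import numpy as np

data = np.arange(10)

view = data[2:5]          # basic slice: a view onto data's memory
copy = data[[2, 3, 4]]    # index array: an independent copy

view[0] = -1              # writes through to data
copy[1] = -2              # leaves data untouched

print(data[2], data[3])                 # -1 3
print(np.shares_memory(data, view))     # True
print(np.shares_memory(data, copy))     # False

# Assigning through an index expression modifies the original in place.
data[[7, 8]] = 0
mask = data < 0
data[mask] = 99

# A tuple is read as one index per axis; a list is read as an index array.
grid = np.arange(6).reshape(2, 3)
as_tuple = grid[(1, 2)]   # same as grid[1, 2], a single element
as_list = grid[[1, 0]]    # rows 1 and 0, a 2x3 array
```

Here only the slice shares memory with data, which is why the write to view shows up in data and the write to copy does not.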
Those coming from academic research will find Python's NumPy library a natural starting point because of its similarity to the MATLAB programming language. Proficiency in NumPy brings the data scientist one step closer to unlocking Python's full potential for comprehensive data analytics.
You can use the helper function numpy.ix_ to get the MATLAB behaviour:
from numpy import ix_
A[ ix_( [0,2,4], [0,2,4] ) ]
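For a fuller picture, here is a small sketch using the 5x5 array from the question. It assumes NumPy is imported as np; rows, cols and block are names made up for the illustration.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
rows = [0, 2, 4]
cols = [0, 2, 4]

# np.ix_ returns index arrays of shapes (3, 1) and (1, 3),
# which broadcast to a 3x3 grid of (row, column) pairs.
r, c = np.ix_(rows, cols)
print(r.shape, c.shape)    # (3, 1) (1, 3)

block = A[np.ix_(rows, cols)]
print(block)
# [[ 0  2  4]
#  [10 12 14]
#  [20 22 24]]

# Assignment through the same index writes into A itself.
A[np.ix_(rows, cols)] = -1
```

Since the index is applied to A in a single step, the same expression can be used on the left-hand side of an assignment.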
You can do this:
A[[0,2,4],:][:,[0,2,4]]
which will give the MATLAB-like result you want.
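One caveat with the chained form, sketched here on the 5x5 array from the question with NumPy imported as np: the first index produces a temporary copy, so the chain works for reading, but an assignment through it does not reach A.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)

sub = A[[0, 2, 4], :][:, [0, 2, 4]]   # reading works as expected

# A[[0, 2, 4], :] is a copy, so the assignment below changes
# that temporary copy and leaves A untouched.
A[[0, 2, 4], :][:, [0, 2, 4]] = -1
unchanged = (A == np.arange(25).reshape(5, 5)).all()
print(unchanged)    # True
```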
It's worth being aware that, rather inconsistently, if you use slices for indexing then you get MATLAB-like results without any such hackery:
>>> A[1:3,1:3]
array([[ 6,  7],
       [11, 12]])
In NumPy, unlike MATLAB, 1:3 is not just an abbreviation for [1,2] or anything of the kind. (At which point I feel obliged to mention something you surely know already, namely that Python's 1:3 is kinda like [1,2] whereas MATLAB's is kinda like [1,2,3]: the right-hand endpoint is included in MATLAB and excluded in Python.)
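To make the difference concrete, here is a short sketch (NumPy imported as np, variable names chosen for illustration) comparing slices with index lists that hold the same values:

```python
import numpy as np

A = np.arange(25).reshape(5, 5)

# A slice selects a range along its axis, so two slices give a block.
by_slice = A[1:3, 1:3]
# Index lists with the same values are paired element by element instead.
by_lists = A[[1, 2], [1, 2]]

print(by_slice.tolist())   # [[6, 7], [11, 12]]
print(by_lists.tolist())   # [6, 12]

# A slice combined with one index list also behaves in the MATLAB-like way.
mixed = A[1:3, [0, 2, 4]]
print(mixed.tolist())      # [[5, 7, 9], [10, 12, 14]]
```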
The efficient way to do this with NumPy is to reshape your index arrays so that they broadcast along the axes they are indexing, i.e.
In [103]: a=numpy.arange(100).reshape(10,10)
In [104]: a
Out[104]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
In [105]: x=numpy.array([3,6,9])
In [106]: y=numpy.array([2,7,8])
In [107]: a[x[:,numpy.newaxis],y[numpy.newaxis,:]]
Out[107]:
array([[32, 37, 38],
       [62, 67, 68],
       [92, 97, 98]])
NumPy's rules of broadcasting are your friend (and so much better than MATLAB's)...
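As a rough sketch of what broadcasting does here (NumPy imported as np; rows, cols, picked, same and via_ix are illustrative names), the two reshaped index arrays expand to a common (3, 3) shape, and the result matches what numpy.ix_ builds:

```python
import numpy as np

a = np.arange(100).reshape(10, 10)
x = np.array([3, 6, 9])
y = np.array([2, 7, 8])

rows = x[:, np.newaxis]    # shape (3, 1)
cols = y[np.newaxis, :]    # shape (1, 3)

# The two index arrays broadcast to a common (3, 3) shape.
print(np.broadcast(rows, cols).shape)    # (3, 3)

picked = a[rows, cols]
# Leaving y one-dimensional works too: (3, 1) broadcasts against (3,).
same = a[x[:, np.newaxis], y]
via_ix = a[np.ix_(x, y)]

print(np.array_equal(picked, same), np.array_equal(picked, via_ix))  # True True
```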
HTH