Given an m x n
matrix I want to split it into square a x a
(a = 3 or a = 4) matrices of arbitrary offset (minimal offset = 1, max offset = block size), like Mathematica's Partition
function does:
For example, given a 4 x 4 matrix A
like
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
If I give 3 x 3 blocks and offset = 1, I want to get the 4 matrices:
1 2 3
5 6 7
9 10 11
2 3 4
6 7 8
10 11 12
5 6 7
9 10 11
13 14 15
6 7 8
10 11 12
14 15 16
If matrix A
is A = np.arange(1, 37).reshape((6,6))
and I use 3 x 3 blocks with offset = 3, I want as output the blocks:
1 2 3
7 8 9
3 14 15
4 5 6
10 11 12
16 17 18
19 20 21
25 26 27
31 32 33
22 23 24
28 29 30
34 35 36
I'm ok with matrix A being a list of lists and I think that I don't need NumPy's functionality. I was surprised that neither array_split
nor numpy.split
provide this offset option out of the box, is it more straightforward to code this in pure Python with slicing or should I look into NumPy's strides? I want the code to be highly legible.
As you hint, there is a way of doing this with strides
In [900]: M = np.lib.stride_tricks.as_strided(A, shape=(2,2,3,3), strides=(16,4,16,4))
In [901]: M
Out[901]:
array([[[[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]],
[[ 2, 3, 4],
[ 6, 7, 8],
[10, 11, 12]]],
[[[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]],
[[ 6, 7, 8],
[10, 11, 12],
[14, 15, 16]]]])
In [902]: M.reshape(4,3,3) # to get it in form you list
Out[902]:
array([[[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]],
[[ 2, 3, 4],
[ 6, 7, 8],
[10, 11, 12]],
[[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]],
[[ 6, 7, 8],
[10, 11, 12],
[14, 15, 16]]])
A problem with strides is that it is advanced, and hard to explain to someone without much numpy experience. I figured out the form without much trial and error, but I've been hanging around here too long. :) ).
But this iterative solution is easier to explain:
In [909]: alist=[]
In [910]: for i in range(2):
...: for j in range(2):
...: alist.append(A[np.ix_(range(i,i+3),range(j,j+3))])
...:
In [911]: alist
Out[911]:
[array([[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]]),
array([[ 2, 3, 4],
[ 6, 7, 8],
[10, 11, 12]]),
array([[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]]),
array([[ 6, 7, 8],
[10, 11, 12],
[14, 15, 16]])]
Which can be turned into an array with np.array(alist)
. There's nothing wrong with using this if it is clearer.
One thing to keep in mind about the as_strided
approach is that it is a view, and changes to M
may change A
, and a change in one place in M
may modify several places in M
. But that reshaping M
may turn it into a copy. So overall it's safer to read values from M
, and use them for calculations like sum
and mean
. In place changes can be unpredictable.
The iterative solution produces copies all around.
The iterative solution with np.ogrid
instead of np.ix_
(otherwise the same idea):
np.array([A[np.ogrid[i:i+3, j:j+3]] for i in range(2) for j in range(2)])
both ix_
and ogrid
are just easy ways constructing the pair of vectors for indexing a block:
In [970]: np.ogrid[0:3, 0:3]
Out[970]:
[array([[0],
[1],
[2]]), array([[0, 1, 2]])]
The same thing but with slice
objects:
np.array([A[slice(i,i+3), slice(j,j+3)] for i in range(2) for j in range(2)])
The list version of this would have similar view
behavior as the as_strided
solution (the elements of the list are views).
For the 6x6 with non-overlapping blocks, try:
In [1016]: np.array([A[slice(i,i+3), slice(j,j+3)] for i in range(0,6,3) for j i
...: n range(0,6,3)])
Out[1016]:
array([[[ 1, 2, 3],
[ 7, 8, 9],
[13, 14, 15]],
[[ 4, 5, 6],
[10, 11, 12],
[16, 17, 18]],
[[19, 20, 21],
[25, 26, 27],
[31, 32, 33]],
[[22, 23, 24],
[28, 29, 30],
[34, 35, 36]]])
Assuming you want contiguous blocks, the inner slices/ranges don't change, just the stepping for the outer i
and j
In [1017]: np.arange(0,6,3)
Out[1017]: array([0, 3])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With