Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Numpy - split a matrix considering offsets

Given an m x n matrix I want to split it into square a x a (a = 3 or a = 4) matrices of arbitrary offset (minimal offset = 1, max offset = block size), like Mathematica's Partition function does:

For example, given a 4 x 4 matrix A like

1  2  3  4 
5  6  7  8
9  10 11 12
13 14 15 16

If I give 3 x 3 blocks and offset = 1, I want to get the 4 matrices:

1  2  3 
5  6  7 
9  10 11

2  3  4 
6  7  8
10 11 12

5  6  7 
9  10 11
13 14 15

6  7  8
10 11 12
14 15 16

If matrix A is A = np.arange(1, 37).reshape((6,6)) and I use 3 x 3 blocks with offset = 3, I want as output the blocks:

1  2  3
7  8  9
3 14 15

 4  5  6
10 11 12
16 17 18

19 20 21
25 26 27
31 32 33

22 23 24
28 29 30
34 35 36

I'm ok with matrix A being a list of lists and I think that I don't need NumPy's functionality. I was surprised that neither array_split nor numpy.split provide this offset option out of the box, is it more straightforward to code this in pure Python with slicing or should I look into NumPy's strides? I want the code to be highly legible.

like image 748
andandandand Avatar asked Oct 19 '22 03:10

andandandand


1 Answers

As you hint, there is a way of doing this with strides

In [900]: M = np.lib.stride_tricks.as_strided(A, shape=(2,2,3,3), strides=(16,4,16,4))
In [901]: M
Out[901]: 
array([[[[ 1,  2,  3],
         [ 5,  6,  7],
         [ 9, 10, 11]],

        [[ 2,  3,  4],
         [ 6,  7,  8],
         [10, 11, 12]]],


       [[[ 5,  6,  7],
         [ 9, 10, 11],
         [13, 14, 15]],

        [[ 6,  7,  8],
         [10, 11, 12],
         [14, 15, 16]]]])
In [902]: M.reshape(4,3,3)  # to get it in form you list
Out[902]: 
array([[[ 1,  2,  3],
        [ 5,  6,  7],
        [ 9, 10, 11]],

       [[ 2,  3,  4],
        [ 6,  7,  8],
        [10, 11, 12]],

       [[ 5,  6,  7],
        [ 9, 10, 11],
        [13, 14, 15]],

       [[ 6,  7,  8],
        [10, 11, 12],
        [14, 15, 16]]])

A problem with strides is that it is advanced, and hard to explain to someone without much numpy experience. I figured out the form without much trial and error, but I've been hanging around here too long. :) ).

But this iterative solution is easier to explain:

In [909]: alist=[]
In [910]: for i in range(2):
     ...:     for j in range(2):
     ...:         alist.append(A[np.ix_(range(i,i+3),range(j,j+3))])
     ...:         
In [911]: alist
Out[911]: 
[array([[ 1,  2,  3],
        [ 5,  6,  7],
        [ 9, 10, 11]]), 
 array([[ 2,  3,  4],
        [ 6,  7,  8],
        [10, 11, 12]]), 
 array([[ 5,  6,  7],
        [ 9, 10, 11],
        [13, 14, 15]]), 
 array([[ 6,  7,  8],
        [10, 11, 12],
        [14, 15, 16]])]

Which can be turned into an array with np.array(alist). There's nothing wrong with using this if it is clearer.

One thing to keep in mind about the as_strided approach is that it is a view, and changes to M may change A, and a change in one place in M may modify several places in M. But that reshaping M may turn it into a copy. So overall it's safer to read values from M, and use them for calculations like sum and mean. In place changes can be unpredictable.

The iterative solution produces copies all around.

The iterative solution with np.ogrid instead of np.ix_ (otherwise the same idea):

np.array([A[np.ogrid[i:i+3, j:j+3]] for i in range(2) for j in range(2)])

both ix_ and ogrid are just easy ways constructing the pair of vectors for indexing a block:

In [970]: np.ogrid[0:3, 0:3]
Out[970]: 
[array([[0],
        [1],
        [2]]), array([[0, 1, 2]])]

The same thing but with slice objects:

np.array([A[slice(i,i+3), slice(j,j+3)] for i in range(2) for j in range(2)])

The list version of this would have similar view behavior as the as_strided solution (the elements of the list are views).

For the 6x6 with non-overlapping blocks, try:

In [1016]: np.array([A[slice(i,i+3), slice(j,j+3)] for i in range(0,6,3) for j i
      ...: n range(0,6,3)])
Out[1016]: 
array([[[ 1,  2,  3],
        [ 7,  8,  9],
        [13, 14, 15]],

       [[ 4,  5,  6],
        [10, 11, 12],
        [16, 17, 18]],

       [[19, 20, 21],
        [25, 26, 27],
        [31, 32, 33]],

       [[22, 23, 24],
        [28, 29, 30],
        [34, 35, 36]]])

Assuming you want contiguous blocks, the inner slices/ranges don't change, just the stepping for the outer i and j

In [1017]: np.arange(0,6,3)
Out[1017]: array([0, 3])
like image 165
hpaulj Avatar answered Jan 02 '23 00:01

hpaulj