What is the most pythonic way of splitting a NumPy matrix (a 2-D array) into equal chunks both vertically and horizontally?
For example :
aa = np.reshape(np.arange(270),(18,15)) # a 18x15 matrix
then a "function" like
ab = np.split2d(aa,(2,3))
would result in a list of 6 matrices, each of shape (9,5). My first guess is to combine hsplit, map and vsplit, but how should map be applied when there are two parameters to define for it, like:
map(np.vsplit(@,3),np.hsplit(aa,2))
NumPy: hsplit() function. The hsplit() function splits an array into multiple sub-arrays horizontally (column-wise). hsplit is equivalent to split with axis=1: the array is split along the second axis, except for 1-D arrays, which are split along axis=0.
NumPy: vsplit() function. The vsplit() function splits an array into multiple sub-arrays vertically (row-wise). Note: vsplit is equivalent to split with axis=0 (the default); the array is always split along the first axis regardless of the array dimension.
NumPy: array_split() function. The array_split() function splits an array into multiple sub-arrays. Unlike split, it does not require the axis to divide evenly, so the resulting sub-arrays may differ in size.
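As a minimal sketch of how the functions described above can be combined for the 18x15 example from the question (a nested list comprehension is used here instead of map; the variable names blocks, band and parts are illustrative only):

```python
import numpy as np

aa = np.arange(270).reshape(18, 15)

# vsplit cuts the rows into 2 bands of shape (9, 15);
# hsplit then cuts each band into 3 blocks of shape (9, 5).
blocks = [block for band in np.vsplit(aa, 2) for block in np.hsplit(band, 3)]
print(len(blocks), blocks[0].shape)  # 6 (9, 5)

# array_split tolerates an uneven division: 15 columns into 4 parts.
parts = np.array_split(aa, 4, axis=1)
print([p.shape for p in parts])  # [(18, 4), (18, 4), (18, 4), (18, 3)]
```

With np.split in place of np.array_split, the last call would raise a ValueError, because 15 columns cannot be divided into 4 equal parts.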
Here's one approach that stays within the NumPy environment -
def view_as_blocks(arr, BSZ):
    # arr is input array, BSZ is block-size
    m,n = arr.shape
    M,N = BSZ
    return arr.reshape(m//M, M, n//N, N).swapaxes(1,2).reshape(-1,M,N)
Sample runs
1) Actual big case to verify shapes :
In [41]: aa = np.reshape(np.arange(270),(18,15))
In [42]: view_as_blocks(aa, (9,5)).shape
Out[42]: (6, 9, 5)
2) Small case to manually verify values:
In [43]: aa = np.reshape(np.arange(36),(6,6))
In [44]: aa
Out[44]:
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
In [45]: view_as_blocks(aa, (2,3)) # Blocks of shape (2,3)
Out[45]:
array([[[ 0, 1, 2],
[ 6, 7, 8]],
[[ 3, 4, 5],
[ 9, 10, 11]],
[[12, 13, 14],
[18, 19, 20]],
[[15, 16, 17],
[21, 22, 23]],
[[24, 25, 26],
[30, 31, 32]],
[[27, 28, 29],
[33, 34, 35]]])
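To make the reshape/swapaxes steps easier to follow, here is a sketch that spells out the intermediate axes and checks divisibility up front. The name view_as_blocks_checked and the explicit check are additions for illustration, not part of the function above:

```python
import numpy as np

def view_as_blocks_checked(arr, block_shape):
    """Hypothetical variant of view_as_blocks with an explicit divisibility check."""
    m, n = arr.shape
    M, N = block_shape
    if m % M or n % N:
        raise ValueError(
            f"shape {arr.shape} is not divisible by block shape {block_shape}"
        )
    # Axes after the first reshape: (block row, row in block, block col, col in block)
    step1 = arr.reshape(m // M, M, n // N, N)
    # Axes after swapaxes: (block row, block col, row in block, col in block)
    step2 = step1.swapaxes(1, 2)
    # Merge the two block axes into a single leading axis.
    return step2.reshape(-1, M, N)

aa = np.arange(36).reshape(6, 6)
blocks = view_as_blocks_checked(aa, (2, 3))
print(blocks.shape)                             # (6, 2, 3)
print(np.array_equal(blocks[3], aa[2:4, 3:6]))  # True
```

Blocks come out in row-major order, so block k corresponds to block row k // 2 and block column k % 2 in this 6x6 example.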
If you are willing to work with other libraries, scikit-image could be of use here, like so -
from skimage.util import view_as_blocks as viewB
out = viewB(aa, tuple(BSZ)).reshape(-1,*BSZ)
Runtime test -
In [103]: aa = np.reshape(np.arange(270),(18,15))
# @EFT's soln
In [99]: %timeit split_2d(aa, (2,3))
10000 loops, best of 3: 23.3 µs per loop
# @glegoux's soln-1
In [100]: %timeit list(get_chunks(aa, 2,3))
100000 loops, best of 3: 3.7 µs per loop
# @glegoux's soln-2
In [111]: %timeit list(get_chunks2(aa, 9, 5))
100000 loops, best of 3: 3.39 µs per loop
# Proposed in this post
In [101]: %timeit view_as_blocks(aa, (9,5))
1000000 loops, best of 3: 1.86 µs per loop
Please note that I have used (2,3) for split_2d and get_chunks as by their definitions, they are using that as the number of blocks. In my case with view_as_blocks, I have the parameter BSZ indicating the block size. So, I have (9,5) there. get_chunks2 follows the same format as view_as_blocks. The outputs should represent the same there.
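One detail worth checking when comparing these outputs: the same six blocks are produced, but not necessarily in the same order. The sketch below restates view_as_blocks and split_2d as they appear in the answers so it runs on its own; the names by_size, by_count and reordered are illustrative only:

```python
import numpy as np

def view_as_blocks(arr, BSZ):
    # BSZ is the block size (rows, cols)
    m, n = arr.shape
    M, N = BSZ
    return arr.reshape(m // M, M, n // N, N).swapaxes(1, 2).reshape(-1, M, N)

def split_2d(array, splits):
    # splits is the number of blocks (vertical, horizontal)
    x, y = splits
    return np.split(np.concatenate(np.split(array, y, axis=1)), x * y)

aa = np.arange(270).reshape(18, 15)
by_size = view_as_blocks(aa, (9, 5))       # blocks ordered row by row
by_count = np.stack(split_2d(aa, (2, 3)))  # blocks ordered column by column

# Reorder split_2d's output from (block col, block row) to (block row, block col).
reordered = by_count.reshape(3, 2, 9, 5).swapaxes(0, 1).reshape(-1, 9, 5)

print(np.array_equal(by_size, by_count))   # False
print(np.array_equal(by_size, reordered))  # True
```

So a block count of (2,3) and a block size of (9,5) describe the same partition of an 18x15 array; only the traversal order differs between the two functions.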
You could use np.split & np.concatenate, the latter to allow the second split to be conducted in a single step:
def split_2d(array, splits):
    x, y = splits
    return np.split(np.concatenate(np.split(array, y, axis=1)), x*y)
ab = split_2d(aa,(2,3))
ab[0].shape
Out[95]: (9, 5)
len(ab)
Out[96]: 6
This also seems like it should be relatively straightforward to generalize to the n-dim case, though I haven't followed that thought all the way through just yet.
Edit:
For a single array as output, just add np.stack:
np.stack(ab).shape
Out[99]: (6, 9, 5)
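On the n-dim remark above, one possible generalization is sketched below. Note that it uses a reshape/transpose approach rather than np.split and np.concatenate, it takes block sizes rather than block counts, and blocks_nd is a hypothetical name, not a NumPy function:

```python
import numpy as np

def blocks_nd(arr, block_shape):
    """Hypothetical n-dim helper: split arr into equal blocks of block_shape."""
    if arr.ndim != len(block_shape):
        raise ValueError("block_shape needs one entry per array axis")
    if any(s % b for s, b in zip(arr.shape, block_shape)):
        raise ValueError("every axis length must be divisible by its block size")
    counts = [s // b for s, b in zip(arr.shape, block_shape)]
    # Split every axis into (number of blocks, block size): (c0, b0, c1, b1, ...)
    interleaved = [v for pair in zip(counts, block_shape) for v in pair]
    # Move all block-count axes to the front and all block-size axes to the back.
    nd = arr.ndim
    order = list(range(0, 2 * nd, 2)) + list(range(1, 2 * nd, 2))
    return arr.reshape(interleaved).transpose(order).reshape(-1, *block_shape)

aa = np.arange(270).reshape(18, 15)
print(blocks_nd(aa, (9, 5)).shape)  # (6, 9, 5)

cube = np.arange(4 * 6 * 8).reshape(4, 6, 8)
out = blocks_nd(cube, (2, 3, 4))
print(out.shape)  # (8, 2, 3, 4)
```

The blocks are returned in row-major order over the block grid, so out[1] is cube[:2, :3, 4:] in the 3-D example.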
To cut this (18,15) matrix:
+-+-+-+
+     +
+-+-+-+
into 2x3 blocks of shape (9,5), like this:
+-+-+-+
+-+-+-+
+-+-+-+
Do:
from pprint import pprint
import numpy as np

M = np.reshape(np.arange(18*15),(18,15))

def get_chunks(M, n, p):
    # n, p: number of blocks vertically and horizontally
    n = len(M)//n       # rows per block
    p = len(M[0])//p    # columns per block
    for i in range(0, len(M), n):
        for j in range(0, len(M[0]), p):
            yield M[i:i+n,j:j+p]

def get_chunks2(M, n, p):
    # n, p: block size (rows, columns)
    for i in range(0, len(M), n):
        for j in range(0, len(M[0]), p):
            yield M[i:i+n,j:j+p]

# list(get_chunks2(M, 9, 5)) gives the same result, slightly faster
chunks = list(get_chunks(M, 2, 3))
pprint(chunks)
Output:
[array([[ 0, 1, 2, 3, 4],
[ 15, 16, 17, 18, 19],
[ 30, 31, 32, 33, 34],
[ 45, 46, 47, 48, 49],
[ 60, 61, 62, 63, 64],
[ 75, 76, 77, 78, 79],
[ 90, 91, 92, 93, 94],
[105, 106, 107, 108, 109],
[120, 121, 122, 123, 124]]),
array([[ 5, 6, 7, 8, 9],
[ 20, 21, 22, 23, 24],
[ 35, 36, 37, 38, 39],
[ 50, 51, 52, 53, 54],
[ 65, 66, 67, 68, 69],
[ 80, 81, 82, 83, 84],
[ 95, 96, 97, 98, 99],
[110, 111, 112, 113, 114],
[125, 126, 127, 128, 129]]),
array([[ 10, 11, 12, 13, 14],
[ 25, 26, 27, 28, 29],
[ 40, 41, 42, 43, 44],
[ 55, 56, 57, 58, 59],
[ 70, 71, 72, 73, 74],
[ 85, 86, 87, 88, 89],
[100, 101, 102, 103, 104],
[115, 116, 117, 118, 119],
[130, 131, 132, 133, 134]]),
array([[135, 136, 137, 138, 139],
[150, 151, 152, 153, 154],
[165, 166, 167, 168, 169],
[180, 181, 182, 183, 184],
[195, 196, 197, 198, 199],
[210, 211, 212, 213, 214],
[225, 226, 227, 228, 229],
[240, 241, 242, 243, 244],
[255, 256, 257, 258, 259]]),
array([[140, 141, 142, 143, 144],
[155, 156, 157, 158, 159],
[170, 171, 172, 173, 174],
[185, 186, 187, 188, 189],
[200, 201, 202, 203, 204],
[215, 216, 217, 218, 219],
[230, 231, 232, 233, 234],
[245, 246, 247, 248, 249],
[260, 261, 262, 263, 264]]),
array([[145, 146, 147, 148, 149],
[160, 161, 162, 163, 164],
[175, 176, 177, 178, 179],
[190, 191, 192, 193, 194],
[205, 206, 207, 208, 209],
[220, 221, 222, 223, 224],
[235, 236, 237, 238, 239],
[250, 251, 252, 253, 254],
[265, 266, 267, 268, 269]])]
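One practical point about the slicing approach: basic slicing returns views, so the chunks above share memory with M, whereas the reshape/swapaxes/reshape route used by view_as_blocks has to copy the data in this example. A small check, with illustrative variable names:

```python
import numpy as np

M = np.arange(18 * 15).reshape(18, 15)

# The first chunk that get_chunks(M, 2, 3) yields is this slice.
sliced = M[0:9, 0:5]
# The same blocks built with reshape/swapaxes/reshape.
reshaped = M.reshape(2, 9, 3, 5).swapaxes(1, 2).reshape(-1, 9, 5)

print(np.shares_memory(sliced, M))    # True
print(np.shares_memory(reshaped, M))  # False
```

Writing into a sliced chunk therefore modifies M as well; call .copy() on the chunk if that is not wanted.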