I have a 3D matrix A of dimensions h x w x c. I want to extract patches of dimensions ph x pw from each "channel" c. ph divides h and pw divides w. In this example,
h x w x c = 4 x 4 x 3
ph x pw = 2 x 2
I know how to do this in tensorflow using gather_nd, but I was hoping for something more efficient to set up, because the dimensions will be big and I'd rather not keep the indices array of gather_nd in memory. Is there possibly an intelligent reshape? Either a numpy or a tensorflow solution would be very nice!
You could use some reshaping and swapping of axes -
A.reshape(h//ph,ph,w//pw,pw,-1).swapaxes(1,2)
Sample run -
In [46]: # Sample inputs
...: import numpy as np
...: h,w,c = 10,12,3
...: ph, pw = 2,2
...: A = np.random.randint(0,9,(h,w,c))
...:
In [47]: A.reshape(h//ph,ph,w//pw,pw,-1).swapaxes(1,2).shape
Out[47]: (5, 6, 2, 2, 3)
Each element (as a block) along the first two axes represents a patch. Thus, for the sample provided, we would have 5 x 6 = 30 patches.
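To make the axis layout concrete, here is a minimal sketch of the same reshape and swapaxes on a C-contiguous array. The name `blocks` and the `np.arange` input are illustrative choices, not from the original post. For a contiguous input like this one, both steps should return views, so no index array is built.

```python
import numpy as np

h, w, c = 10, 12, 3
ph, pw = 2, 2
A = np.arange(h * w * c).reshape(h, w, c)  # C-contiguous sample input

# Axes after swapaxes: (block row, block col, row in patch, col in patch, channel)
blocks = A.reshape(h // ph, ph, w // pw, pw, -1).swapaxes(1, 2)

print(blocks.shape)                 # (5, 6, 2, 2, 3)
print(np.shares_memory(A, blocks))  # True: blocks is a view of A

# blocks[i, j] is the patch at block row i, block column j
print(np.array_equal(blocks[1, 2], A[ph:2 * ph, 2 * pw:3 * pw]))  # True
```

Indexing `blocks[i, j]` therefore addresses a patch by its grid position without copying any data.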
If you want those patches along one merged first axis, use one more reshape -
In [85]: out = A.reshape(h//ph,ph,w//pw,pw,-1).swapaxes(1,2).reshape(-1,ph,pw,c)
In [86]: out.shape
Out[86]: (30, 2, 2, 3)
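The merged version could be wrapped in a small helper, sketched here on the 4 x 4 x 3 case from the question. The function name `extract_patches` and the divisibility check are assumptions added for illustration. Note that, unlike the 5-D result, the final reshape generally has to copy the data, because the swapped axes are no longer contiguous in memory.

```python
import numpy as np

def extract_patches(A, ph, pw):
    """Hypothetical helper: split an (h, w, c) array into (n, ph, pw, c) patches."""
    h, w, c = A.shape
    if h % ph or w % pw:
        raise ValueError("ph must divide h and pw must divide w")
    blocks = A.reshape(h // ph, ph, w // pw, pw, c).swapaxes(1, 2)
    return blocks.reshape(-1, ph, pw, c)  # merging the first two axes makes a copy here

A = np.arange(4 * 4 * 3).reshape(4, 4, 3)
patches = extract_patches(A, 2, 2)
print(patches.shape)                 # (4, 2, 2, 3)

# Patches are ordered row by row over the block grid
print(np.array_equal(patches[1], A[:2, 2:4]))  # True

# If a channel-first layout is preferred, move the last axis to the front
per_channel = np.moveaxis(patches, -1, 0)
print(per_channel.shape)             # (3, 4, 2, 2)
```

With `per_channel`, `per_channel[k, n]` is the n-th ph x pw patch of channel k.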
Let's verify by manually inspecting the values themselves -
In [81]: A[:ph,:pw] # First patch
Out[81]:
array([[[6, 5, 2],
        [4, 0, 1]],

       [[0, 0, 4],
        [2, 3, 0]]])
In [82]: A[:ph,pw:2*pw] # Second patch
Out[82]:
array([[[8, 3, 3],
        [0, 0, 2]],

       [[8, 5, 4],
        [3, 4, 6]]])
In [83]: out[0]
Out[83]:
array([[[6, 5, 2],
        [4, 0, 1]],

       [[0, 0, 4],
        [2, 3, 0]]])
In [84]: out[1]
Out[84]:
array([[[8, 3, 3],
        [0, 0, 2]],

       [[8, 5, 4],
        [3, 4, 6]]])
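The manual check above covers only the first two patches. A rough way to check all of them is to compare each entry of `out` against the corresponding slice of `A`. The seeded random generator and the names `n_cols` and `all_match` are illustrative assumptions.

```python
import numpy as np

h, w, c = 10, 12, 3
ph, pw = 2, 2
rng = np.random.default_rng(0)          # seeded only to make the run repeatable
A = rng.integers(0, 9, size=(h, w, c))

out = A.reshape(h // ph, ph, w // pw, pw, -1).swapaxes(1, 2).reshape(-1, ph, pw, c)

# Patch number i * n_cols + j should equal the slice at block row i, block column j
n_cols = w // pw
all_match = all(
    np.array_equal(out[i * n_cols + j], A[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw])
    for i in range(h // ph)
    for j in range(n_cols)
)
print(all_match)  # True
```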