I have an HxW "feature map", F (assume it is an HxWx1 map). Through some other operation, I have a set of N pixels that are of interest to me. Each of these pixels is associated with its own value, so my set is an Nx3 array where each row is of the form x, y, val. Note that this val is different from the feature map value at that location.
Here is my question: is it possible to vectorize a neighbourhood operation for each of these points? For each pixel n of the N, I wish to multiply its val by its 3x3 neighbourhood in the feature map F. This gives a new 3x3 set of elements, new val. I then want to replace the pixel's (x, y) with the location of the maximum of new val (the multiplied feature map) within the 3x3 window.
This sounds similar to a convolution (slight abuse of terminology here) followed by a max pool operation, but not exactly, since each pixel location has a different val to be multiplied.
Sample input and output, and walkthrough for required solution
Let us assume H=10 and W=10
Here is a sample F
0.635955 0.922379 0.993406 0.007837 0.818661 0.983730 0.199866 0.757519 0.073152 0.015831
0.397718 0.097353 0.231351 0.177886 0.343099 0.419940 0.017342 0.087294 0.402266 0.366337
0.978686 0.476594 0.067836 0.148977 0.058994 0.810586 0.542894 0.797419 0.386559 0.225982
0.479860 0.033354 0.353366 0.431562 0.336208 0.674272 0.398151 0.713732 0.598623 0.829230
0.940838 0.869564 0.287100 0.669844 0.631836 0.748982 0.762292 0.597999 0.540236 0.758802
0.925995 0.141296 0.466772 0.672663 0.929746 0.544029 0.991860 0.197474 0.762866 0.798973
0.543519 0.128332 0.624323 0.876569 0.050709 0.223705 0.708381 0.380842 0.818092 0.163447
0.283125 0.329618 0.283481 0.672950 0.136922 0.897785 0.385479 0.764824 0.132671 0.091148
0.661984 0.369459 0.501181 0.352681 0.554113 0.133283 0.593048 0.108534 0.397813 0.836065
0.654929 0.928576 0.539204 0.931213 0.344114 0.591214 0.126809 0.456681 0.036531 0.725228
My structure of pixels, let us say N=3
The three values in the order of row,col,val: (for simplicity I assume x is rows, and y is cols, though it isn't necessarily the case). This is completely independent of the feature map in the previous step.
3,2,0.38
4,4,0.602
7,5,0.9647
The neighborhood around (3,2) is:
[[0.4765941 , 0.06783561, 0.14897662],
[0.03335438, 0.35336647, 0.4315618 ],
[0.86956374, 0.28709952, 0.66984412]]
Thus val * neighborhood yields (here val is 0.38):
[[0.18110576, 0.02577753, 0.05661112],
[0.01267466, 0.13427926, 0.16399349],
[0.33043422, 0.10909782, 0.25454077]]
The maximum is at window index (2,0), i.e. an offset of (1,-1) relative to the center pixel. Thus my updated (x,y) should be (3,2) + (1,-1) = (4,1).
Similarly, the updated pixels for the other two are (5,4) and (7,5).
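For reference, the walkthrough above corresponds to a plain per-pixel loop like the following. This is only a sketch of the non-vectorized baseline: the function name update_pixels_loop is made up for illustration, and it assumes every pixel is at least one step away from the border so the 3x3 slice always fits.

```python
import numpy as np

# Sample F from above (10x10).
F = np.array([
    [0.635955, 0.922379, 0.993406, 0.007837, 0.818661, 0.983730, 0.199866, 0.757519, 0.073152, 0.015831],
    [0.397718, 0.097353, 0.231351, 0.177886, 0.343099, 0.419940, 0.017342, 0.087294, 0.402266, 0.366337],
    [0.978686, 0.476594, 0.067836, 0.148977, 0.058994, 0.810586, 0.542894, 0.797419, 0.386559, 0.225982],
    [0.479860, 0.033354, 0.353366, 0.431562, 0.336208, 0.674272, 0.398151, 0.713732, 0.598623, 0.829230],
    [0.940838, 0.869564, 0.287100, 0.669844, 0.631836, 0.748982, 0.762292, 0.597999, 0.540236, 0.758802],
    [0.925995, 0.141296, 0.466772, 0.672663, 0.929746, 0.544029, 0.991860, 0.197474, 0.762866, 0.798973],
    [0.543519, 0.128332, 0.624323, 0.876569, 0.050709, 0.223705, 0.708381, 0.380842, 0.818092, 0.163447],
    [0.283125, 0.329618, 0.283481, 0.672950, 0.136922, 0.897785, 0.385479, 0.764824, 0.132671, 0.091148],
    [0.661984, 0.369459, 0.501181, 0.352681, 0.554113, 0.133283, 0.593048, 0.108534, 0.397813, 0.836065],
    [0.654929, 0.928576, 0.539204, 0.931213, 0.344114, 0.591214, 0.126809, 0.456681, 0.036531, 0.725228],
])

# Rows of (row, col, val), as listed above.
pixels = np.array([[3, 2, 0.38], [4, 4, 0.602], [7, 5, 0.9647]])


def update_pixels_loop(F, pixels):
    """Reference loop: one 3x3 window per pixel (interior pixels only)."""
    updated = []
    for row, col, val in pixels:
        r, c = int(row), int(col)
        # val times the 3x3 neighbourhood centred on (r, c)
        window = val * F[r - 1:r + 2, c - 1:c + 2]
        # position of the maximum inside the window, 0..2 on each axis
        dr, dc = np.unravel_index(window.argmax(), window.shape)
        # subtract 1 to turn the window index into an offset from the centre
        updated.append((r + dr - 1, c + dc - 1))
    return np.array(updated)


print(update_pixels_loop(F, pixels))
# [[4 1]
#  [5 4]
#  [7 5]]
```

This is the behaviour I would like to reproduce without the Python loop.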
How can I parallelize this entire thing? (Hopefully to be loaded onto a GPU using Pytorch, but not necessarily, I have not come to that stage yet.)
Note: I had asked this question a few days ago, but it was poorly framed without proper info. Hopefully this solves the issue.
Edit: For this specific instance, F can be produced as a random array:
F = np.random.rand(10,10)
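The pixel set can be generated the same way. A minimal sketch, where the seed and the names rng, rows, cols and vals are arbitrary choices for illustration, and the pixels are restricted to the interior so that every 3x3 window fits inside F:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
H, W, N = 10, 10, 3

F = rng.random((H, W))  # random feature map, same idea as np.random.rand(10, 10)

# Interior rows/cols only (1 .. H-2 and 1 .. W-2) so each 3x3 window fits.
rows = rng.integers(1, H - 1, size=N)
cols = rng.integers(1, W - 1, size=N)
vals = rng.random(N)  # independent of F

# (N, 3) float array with columns row, col, val
pixels = np.column_stack([rows, cols, vals])
```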
If I understand correctly, you want this:
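import numpy as np
from skimage.util.shape import view_as_windows
# pixels is the (N, 3) array of row, col, val from the question
idx = pixels[:, 0:2].astype(int)
# 3x3 window centred on each pixel (window [i, j] is centred on (i+1, j+1)), scaled by that pixel's val
scaled = view_as_windows(F, (3, 3))[tuple(idx.T - 1)] * pixels[:, -1][:, None, None]
# argmax inside each window, converted back to absolute (row, col)
print((np.unravel_index(scaled.reshape(-1, 9).argmax(1), (3, 3)) + idx.T).T - 1)
# if you need to replace the values of F with the new values
F[tuple(idx.T)] = scaled.reshape(-1, 9).max(1)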
I assumed your window shape is (3,3); of course, you can change it. If you need to deal with edge neighborhoods, pad F with enough 0s (depending on your window size) using np.pad before calling view_as_windows, and shift the window index by the pad width accordingly.
output:
[[4 1]
[5 4]
[7 5]]
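For the edge case mentioned above, here is a rough sketch of the padded variant. It swaps in NumPy's own sliding_window_view (available from NumPy 1.20) for view_as_windows so the snippet needs only NumPy. The function name update_pixels_padded and the parameter k are made up for illustration. It assumes F is nonnegative and every val is positive, so that the zero padding cannot beat a real neighbour.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def update_pixels_padded(F, pixels, k=3):
    """Move each (row, col) to the argmax of val * its k x k neighbourhood."""
    pad = k // 2
    # Zero padding; assumes F >= 0 and val > 0 so a padded cell never wins.
    Fp = np.pad(F, pad, mode="constant", constant_values=0)
    idx = pixels[:, :2].astype(int)
    # In the padded array, window [r, c] is centred on original pixel (r, c),
    # so no "- 1" shift is needed when indexing the windows.
    windows = sliding_window_view(Fp, (k, k))[idx[:, 0], idx[:, 1]]
    scaled = windows * pixels[:, 2][:, None, None]
    # argmax inside each flattened window, back to (row, col) within the window
    dr, dc = np.unravel_index(scaled.reshape(len(idx), -1).argmax(1), (k, k))
    # window index -> offset from the centre -> absolute position
    return idx + np.stack([dr, dc], axis=1) - pad


# Small example with two corner pixels and one interior pixel.
F_small = np.arange(16.0).reshape(4, 4)
pixels_small = np.array([[0, 0, 0.5], [3, 3, 2.0], [1, 2, 1.0]])
print(update_pixels_padded(F_small, pixels_small))
# [[1 1]
#  [3 3]
#  [2 3]]
```

If val can be negative or F can contain negative entries, zero padding is no longer safe and the border needs different handling.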