I have a large image as a 2D array (let's assume that it is a 500 by 1000 pixel grayscale image), and I have one small image (let's say it is 15 by 15 pixels). I would like to slide the small image over the large one and, for a given position of the small image, calculate a measure of similarity between the small image and the underlying part of the big image.
I would like to be flexible in choosing the measure of similarity. For example, I might want to compute the mean squared deviation, the mean absolute deviation, or something else (just some operation that takes two matrices of the same size and returns a real number).
The result should be a 2D array. I want to do this operation efficiently (which means that I want to avoid loops).
As a measure of similarity I plan to use the squared deviations between the colors of the two images. However, as I have mentioned, it would be nice if I could change the measure (for example, to use the correlation between the colors).
Is there a function in numpy that can do it?
To start off, let's import all the relevant modules/functions that would be used across the various approaches listed in this post -

import numpy as np
from skimage.util import view_as_windows
from skimage.feature import match_template
import cv2
from cv2 import matchTemplate as cv2m
from scipy.ndimage import uniform_filter as unif2d  # scipy.ndimage.filters is deprecated
from scipy.signal import convolve2d as conv2
Skimage based approaches for computing mean absolute deviation:

Using scikit-image for getting a sliding 4D array of views and then np.mean for the average calculation -
def skimage_views_MAD_v1(img, tmpl):
    return np.abs(view_as_windows(img, tmpl.shape) - tmpl).mean(axis=(2,3))
Using scikit-image for getting a sliding 4D array of views and then np.einsum for the average calculation -
def skimage_views_MAD_v2(img, tmpl):
    subs = np.abs(view_as_windows(img, tmpl.shape) - tmpl)
    return np.einsum('ijkl->ij',subs)/float(tmpl.size)
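As a quick sanity check (not part of the original answer), both MAD versions should agree on small random inputs -

img = np.random.rand(20, 30)
tmpl = np.random.rand(5, 5)
print(np.allclose(skimage_views_MAD_v1(img, tmpl),
                  skimage_views_MAD_v2(img, tmpl)))   # True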
Skimage based approaches for computing mean squared deviation:

Using similar techniques, we would have two approaches for mean squared deviation -
def skimage_views_MSD_v1(img, tmpl):
    return ((view_as_windows(img, tmpl.shape) - tmpl)**2).mean(axis=(2,3))

def skimage_views_MSD_v2(img, tmpl):
    subs = view_as_windows(img, tmpl.shape) - tmpl
    return np.einsum('ijkl,ijkl->ij',subs, subs)/float(tmpl.size)
Convolution based approaches for computing mean squared deviations:

Convolution could be used to compute mean squared deviations in a slightly tweaked way. For the case of sum of squared deviations, within each window we are performing elementwise subtractions, then squaring each difference, and then summing all of those up.
Let's consider a 1D example for a closer look -
a : [a1, a2, a3, a4, a5, a6, a7, a8] # Image array
b : [b1, b2, b3] # Template array
For the first window operation, we would have :
(a1-b1)**2 + (a2-b2)**2 + (a3-b3)**2
Let's use the (a-b)**2 expansion :

(a - b)**2 = a**2 - 2*a*b + b**2
Thus, we would have for the first window :
(a1**2 - 2*a1*b1 + b1**2) + (a2**2 - 2*a2*b2 + b2**2) + (a3**2 - 2*a3*b3 + b3**2)
Similarly, for the second window :
(a2**2 - 2*a2*b1 + b1**2) + (a3**2 - 2*a3*b2 + b2**2) + (a4**2 - 2*a4*b3 + b3**2)
and so on.
So, we have three parts to these computations -

1. Squaring of a's and summing those up in sliding windows.

2. Squaring of b's and summing those up. This stays the same across all windows.

3. The final piece of the puzzle: (2*a1*b1 + 2*a2*b2 + 2*a3*b3), (2*a2*b1 + 2*a3*b2 + 2*a4*b3) and so on for the first, second and subsequent windows. This could be computed by convolving a with a flipped version of b, because convolution, as per its definition, runs the kernel from the other direction in a sliding fashion, computing elementwise multiplications and summing up the elements within each window; hence the flip is needed here (a quick 1D check of this decomposition follows below).
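Here's a 1D sanity check of that decomposition (not part of the original derivation; it just confirms the algebra with np.convolve) -

a = np.random.rand(8)
b = np.random.rand(3)

# Direct sum of squared deviations per sliding window
direct = np.array([((a[i:i+3] - b)**2).sum() for i in range(6)])

# Decomposed form : windowed sum of a**2, plus sum of b**2, minus twice
# the sliding dot product (convolution with the flipped template)
sq_sums = np.convolve(a**2, np.ones(3), 'valid')
cross = np.convolve(a, b[::-1], 'valid')
print(np.allclose(direct, sq_sums + (b**2).sum() - 2*cross))   # True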
Extending these ideas to the 2D case and using Scipy's convolve2d and uniform_filter, we would have two more approaches to compute mean squared deviations, like so -
def convolution_MSD(img, tmpl):
    # Sum of a**2 per window, plus sum of b**2, minus twice the correlation
    # (convolution with the flipped template); assumes a square template
    n = tmpl.shape[0]
    sums = conv2(img**2, np.ones((n,n)), 'valid')
    out = sums + (tmpl**2).sum() - 2*conv2(img, tmpl[::-1,::-1], 'valid')
    return out/(n*n)
def uniform_filter_MSD(img, tmpl):
    # Same decomposition, but the windowed means come from uniform_filter;
    # assumes an odd-sized square template (needed for the centered slicing)
    n = tmpl.shape[0]
    hWSZ = (n-1)//2
    sums = unif2d(img.astype(float)**2, size=(n,n))[hWSZ:-hWSZ, hWSZ:-hWSZ]
    out = sums + (tmpl**2).mean() - 2*conv2(img, tmpl[::-1,::-1], 'valid')/float(n*n)
    return out
Skimage based approaches for computing normalized cross-correlation:

def skimage_match_template(img, tmpl):
    return match_template(img, tmpl)
Please note that, since these are cross-correlation values, the closeness between image and template would be characterized by a high output value.
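To go from the correlation map to the best-match location, one common pattern (a usage sketch, not part of the original answer) is to take the argmax -

res = match_template(img, tmpl)
ij = np.unravel_index(np.argmax(res), res.shape)
print(ij)   # (row, col) of the top-left corner of the best match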
OpenCV offers various template-matching methods for classifying templates -
def opencv_generic(img, tmpl, method_string='SQDIFF'):
    # Methods :
    # 'CCOEFF'        : Correlation coefficient
    # 'CCOEFF_NORMED' : Correlation coefficient, normalized
    # 'CCORR'         : Cross-correlation
    # 'CCORR_NORMED'  : Cross-correlation, normalized
    # 'SQDIFF'        : Squared differences
    # 'SQDIFF_NORMED' : Squared differences, normalized
    method = getattr(cv2, 'TM_' + method_string)   # safer than eval
    return cv2m(img.astype('uint8'), tmpl.astype('uint8'), method)
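As a usage sketch (assuming the 'SQDIFF' method, where lower values mean a better match), cv2.minMaxLoc can pull out the best-match location; note that OpenCV reports locations as (x, y), i.e. (col, row) -

res = opencv_generic(img, tmpl, 'SQDIFF')
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
print(min_loc)   # (x, y) of the best (lowest-SQDIFF) match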
We could use the 4D views as shown earlier in this post and perform the custom similarity measures as NumPy ufuncs along the last two axes.

Thus, we get the sliding windows as a 4D array as used previously, like so -

img_4D = view_as_windows(img, tmpl.shape)

Please note that, being a view into the input image, this costs no extra memory. But later operations would make copies, depending on the operations themselves. For instance, a comparison operation would result in a much smaller memory footprint (8 times smaller to be exact, since booleans take 1 byte against 8 bytes for float64).
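One way to confirm the no-copy claim at runtime (np.shares_memory is a standard NumPy utility; this check is not part of the original answer) -

print(np.shares_memory(img_4D, img))   # True : the 4D view reuses img's buffer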
Then, we perform the intended operation between img_4D and tmpl, which with a linearly mapped operation would result in another 4D array following broadcasting. Let's call it img_sub. Next up, most probably, we would have some reduction operation to give us a 2D output. Again, in most cases, one of the NumPy ufuncs could be used here. We need to apply this ufunc along the last two axes of img_sub. Many ufuncs and reductions allow us to work on more than one axis at a time; for example, earlier we computed the mean along the last two axes in one go. Otherwise, we would need to operate along those two axes one after another, as shown below.
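For instance, here's a minimal sketch of the axis-by-axis route for the absolute-deviation case, equivalent to calling .mean(axis=(2,3)) in one go -

img_sub = np.abs(view_as_windows(img, tmpl.shape) - tmpl)   # 4D array
out = img_sub.mean(axis=3).mean(axis=2)                     # reduce one axis at a time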
Example
As an example of how to use this, let's consider a custom function :

mean((img_W**2)*tmpl - 2*img_W*tmpl**2)

Here, we have img_W as the sliding window from img, and tmpl, as usual, is the template that is slid across the height and width of img.
Implemented with two nested loops, we would have :
def func1(a,b):
    m1,n1 = a.shape
    m2,n2 = b.shape
    mo,no = m1-m2+1, n1-n2+1
    out = np.empty((mo,no))
    for i in range(mo):
        for j in range(no):
            out[i,j] = ((a[i:i+m2,j:j+n2]**2)*b - 2*a[i:i+m2,j:j+n2]*(b**2)).mean()
    return out
Now, using view_as_windows, we would have a vectorized solution :

def func2(a,b):
    a4D = view_as_windows(a, b.shape)   # fixed: use the function arguments, not globals
    return ((a4D**2)*b - 2*a4D*(b**2)).mean(axis=(2,3))
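Before timing, a quick correctness check between the loopy and the vectorized versions (not part of the original benchmarks) -

a = np.random.randint(4, 9, (50, 100))
b = np.random.randint(2, 9, (15, 15))
print(np.allclose(func1(a, b), func2(a, b)))   # True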
Runtime test -
In [89]: # Sample image(a) and template(b)
...: a = np.random.randint(4,9,(50,100))
...: b = np.random.randint(2,9,(15,15))
...:
In [90]: %timeit func1(a,b)
1 loops, best of 3: 147 ms per loop
In [91]: %timeit func2(a,b)
100 loops, best of 3: 17.8 ms per loop
Decent sized datasets :
In [94]: # Inputs
...: img = np.random.randint(0,255,(50,100))
...: tmpl = np.random.randint(0,255,(15,15))
...:
In [95]: out1 = skimage_views_MSD_v1(img, tmpl)
...: out2 = skimage_views_MSD_v2(img, tmpl)
...: out3 = convolution_MSD(img, tmpl)
...: out4 = uniform_filter_MSD(img, tmpl)
...: out5 = opencv_generic(img, tmpl, 'SQDIFF')/tmpl.size
...:
...: print np.allclose(out1, out2)
...: print np.allclose(out1, out3)
...: print np.allclose(out1, out4)
...: print np.allclose(out1, out5)
...:
True
True
True
True
In [96]: %timeit skimage_views_MSD_v1(img, tmpl)
...: %timeit skimage_views_MSD_v2(img, tmpl)
...: %timeit convolution_MSD(img, tmpl)
...: %timeit uniform_filter_MSD(img, tmpl)
...: %timeit opencv_generic(img, tmpl, 'SQDIFF')/tmpl.size
...:
100 loops, best of 3: 8.49 ms per loop
100 loops, best of 3: 3.87 ms per loop
100 loops, best of 3: 5.96 ms per loop
100 loops, best of 3: 3.25 ms per loop
10000 loops, best of 3: 201 µs per loop
For bigger datasizes, depending on the system RAM available, we might have to fall back to methods other than the views-based ones, which leave a noticeable memory footprint.
Let's test out on bigger datasizes with the remaining approaches -
In [97]: # Inputs
...: img = np.random.randint(0,255,(500,1000))
...: tmpl = np.random.randint(0,255,(15,15))
...:
In [98]: %timeit convolution_MSD(img, tmpl)
...: %timeit uniform_filter_MSD(img, tmpl)
...: %timeit opencv_generic(img, tmpl, 'SQDIFF')/tmpl.size
...:
1 loops, best of 3: 910 ms per loop
1 loops, best of 3: 483 ms per loop
100 loops, best of 3: 16.1 ms per loop
When working with known similarity measures, i.e. one of the six methods supported by OpenCV's template matching, and if we have access to OpenCV, that would be the best approach.
Without OpenCV, for a special case like mean squared deviation, we could make use of convolution to have decently efficient approaches.
For generic/custom functions and with decent datasizes, we could look into 4D views to have efficient vectorized solutions. For large datasizes, we might want to use one loop with 3D views instead of 4D ones, with the intention of reducing the memory footprint; a sketch of that idea follows. For really large ones, you might have to fall back to two nested loops.
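As a rough sketch of that one-loop idea for the mean-squared-deviation case (views3D_MSD is a name introduced here for illustration; it is not from the original answer) -

def views3D_MSD(img, tmpl):
    W = view_as_windows(img, tmpl.shape)        # 4D view : costs no memory by itself
    out = np.empty(W.shape[:2])
    for i in range(W.shape[0]):                 # one loop over output rows
        # W[i] is a 3D view, so temporaries stay one output row's worth in size
        out[i] = ((W[i] - tmpl)**2).mean(axis=(1,2))
    return out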