Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compute mean squared, absolute deviation and custom similarity measure - Python/NumPy

I have a large image as an 2D array (let's assume that it is a 500 by 1000 pixels gray scale image). And I have one small image (let's say is it 15 by 15 pixels). I would like to slide the small image over the large one and for a given position of the small image I would like to calculate a measure of similarity between the small image and the underling part of the big image.

I would like to be flexible in choosing a measure of similarity. For example I might want to calculate mean squared deviation or mean absolute deviation or something else (just some operation that takes two matrices of the same size and returns a real number).

The result should be a 2D array. I want to do this operation efficiently (which means that I want to avoid loops).

As a measure of similarity I plan to use the square deviations between the colors of the two images. However, as I have mentioned, it would be nice if I can change the measure (for example to use correlation between the colors).

Is there a function in numpy that can do it?

like image 809
Roman Avatar asked Dec 26 '16 11:12

Roman


1 Answers

Import modules

To start off, let's import all the relevant modules/functions that would be used across the various approaches listed in this post -

from skimage.util import view_as_windows
from skimage.feature import match_template
import cv2
from cv2 import matchTemplate as cv2m
from scipy.ndimage.filters import uniform_filter as unif2d
from scipy.signal import convolve2d as conv2

A. Views based approach for MAD, MSD

Skimage based approaches for computing mean absolute deviation :

Using scikit-image for getting slided 4D array of views and then np.mean for the average calculations -

def skimage_views_MAD_v1(img, tmpl):
    return np.abs(view_as_windows(img, tmpl.shape) - tmpl).mean(axis=(2,3))

Using scikit-image for getting slided 4D array of views and then np.einsum for the squared-average calculations -

def skimage_views_MAD_v2(img, tmpl):
    subs = np.abs(view_as_windows(img, tmpl.shape) - tmpl)
    return np.einsum('ijkl->ij',subs)/float(tmpl.size)

Skimage based approaches for computing mean squared deviation :

Using the similar techniques, we would have two approaches for mean squared deviation -

def skimage_views_MSD_v1(img, tmpl):
    return ((view_as_windows(img, tmpl.shape) - tmpl)**2).mean(axis=(2,3))

def skimage_views_MSD_v2(img, tmpl):
    subs = view_as_windows(img, tmpl.shape) - tmpl
    return np.einsum('ijkl,ijkl->ij',subs, subs)/float(tmpl.size)

B. Convolution based approach for MSD

Convolution based approaches for computing mean squared deviations :

Convolution could be used to used to compute mean squared deviations in a bit of tweaked way. For the case of sum of squared deviations, within each window we are performing elementwise subtractions and then squaring each subtraction and then summing up all those.

Let's consider a 1D example for a closer look -

a : [a1, a2, a3, a4, a5, a6, a7, a8] # Image array
b : [b1, b2, b3]                     # Template array

For the first window operation, we would have :

(a1-b1)**2 + (a2-b2)**2 + (a3-b3)**2

Let's use (a-b)**2 formula :

(a - b)**2 = a**2 - 2*a*b +b**2

Thus, we would have for the first window :

(a1**2 - 2*a1*b1 +b1**2) + (a2**2 - 2*a2*b2 +b2**2) + (a3**2 - 2*a3*b3 +b3**2)

Similarly, for the second window :

(a2**2 - 2*a2*b1 +b1**2) + (a3**2 - 2*a3*b2 +b2**2) + (a4**2 - 2*a4*b3 +b3**2)

and so on.

So, we have three parts to these computations -

  • Squaring of a's and summing of those in sliding windows.

  • Squaring of b's and summing of those. This stays the same among all windows.

  • Final piece of puzzle is : (2*a1*b1, 2*a2*b2, 2*a3*b3), (2*a2*b1, 2*a3*b2, 2*a4*b3) and so on for the first, second and so on windows. This could be computed by a 2D convolution with a and flipped version of b as per the definition of convolution that runs the kernel from the other direction in a sliding and computes elementwise multiplication and sums elements within each window and hence the flipped needed here.

Extending these ideas to 2D case and using Scipy's convolve2d and uniform_filter, we would have two more approaches to compute mean squared deviations, like so -

def convolution_MSD(img, tmpl):
    n = tmpl.shape[0]
    sums = conv2(img**2,np.ones((n,n)),'valid')
    out = sums + (tmpl**2).sum() -2*conv2(img,tmpl[::-1,::-1],'valid')
    return out/(n*n)

def uniform_filter_MSD(img, tmpl):
    n = tmpl.shape[0]
    hWSZ = (n-1)//2
    sums = unif2d(img.astype(float)**2,size=(n,n))[hWSZ:-hWSZ,hWSZ:-hWSZ]
    out = sums + (tmpl**2).mean() -2*conv2(img,tmpl[::-1,::-1],'valid')/float(n*n)
    return out

C. Skimage based approach for NCC

Skimage based approaches for computing normalized cross-correlation :

def skimage_match_template(img, tmpl):
    return match_template(img, tmpl)

Please note that, since these are cross-correlation values, the closeness between image and template would be characterized by a high output value.


D. OpenCV based approach for various similarity measures

OpenCV offers various template-matching methods of classifying templates -

def opencv_generic(img, tmpl, method_string ='SQDIFF'):
    # Methods : 
    # 'CCOEFF' : Correlation coefficient
    # 'CCOEFF_NORMED' : Correlation coefficient normalized
    # 'CCORR' : Cross-correlation
    # 'CCORR_NORMED' : Cross-correlation normalized
    # 'SQDIFF' : Squared differences
    # 'SQDIFF_NORMED' : Squared differences normalized

    method = eval('cv2.TM_' + method_string)
    return cv2m(img.astype('uint8'),tmpl.astype('uint8'),method)

E. Views based approach : Custom function

We could use 4D views as shown earlier in this post and perform the custom similarity measures as NumPy ufuncs along the last two axes.

Thus, we get the sliding windows as a 4D array as used previously, like so -

img_4D = view_as_windows(img, tmpl.shape)

Please note that being a view into the input image, it won't cost anymore on memory. But the later operations would make copy depending on those operations themselves. A comparison operation would result in much lesser memory footprint (8 times lesser on Linux to be exact).

Then, we perform the intended operation between img_4D and tmpl, which in a linearly mapped operation would result in another 4D array following broadcasting. Let's call it img_sub. Next up, most probably, we would have some reduction operation to give us a 2D output. Again, in most cases, one of the NumPy ufuncs could be used here. We need to use this ufunc along the last two axes on img_sub. Again, many ufuncs allow us to work on more than one axis at a time. For example, earlier we used mean computation along last two axes in one go. Otherwise, we need to operate along those two axes one after another.

Example

As an example on how to use let's consider a custom function :

mean((img_W**tmpl)*tmpl - 2*img*tmpl**2)

Here, we have img_W as the sliding window from img and tmpl as usual is the template that is slided across the height and width of img.

Implemented with two nested loops, we would have :

def func1(a,b):
    m1,n1 = a.shape
    m2,n2 = b.shape
    mo,no = m1-m2+1, n1-n2+1
    out = np.empty((mo,no))
    for i in range(mo):
        for j in range(no):
            out[i,j] = ((a[i:i+m2,j:j+n2]**2)*b - 2*a[i:i+m2,j:j+n2]*(b**2)).mean()
    return out

Now, using view_as_windows, we would have a vectorized solution :

def func2(a,b):
    a4D = view_as_windows(img, tmpl.shape)
    return ((a4D**2)*b - 2*a4D*(b**2)).mean(axis=(2,3))

Runtime test -

In [89]: # Sample image(a) and template(b)
    ...: a = np.random.randint(4,9,(50,100))
    ...: b = np.random.randint(2,9,(15,15))
    ...: 

In [90]: %timeit func1(a,b)
1 loops, best of 3: 147 ms per loop

In [91]: %timeit func2(a,b)
100 loops, best of 3: 17.8 ms per loop

Benchmarking : Mean squared deviation

Decent sized datasets :

In [94]: # Inputs
    ...: img = np.random.randint(0,255,(50,100))
    ...: tmpl = np.random.randint(0,255,(15,15))
    ...: 

In [95]: out1 = skimage_views_MSD_v1(img, tmpl)
    ...: out2 = skimage_views_MSD_v2(img, tmpl)
    ...: out3 = convolution_MSD(img, tmpl)
    ...: out4 = uniform_filter_MSD(img, tmpl)
    ...: out5 = opencv_generic(img, tmpl, 'SQDIFF')/tmpl.size
    ...: 
    ...: print np.allclose(out1, out2)
    ...: print np.allclose(out1, out3)
    ...: print np.allclose(out1, out4)
    ...: print np.allclose(out1, out5)
    ...: 
True
True
True
True

In [96]: %timeit skimage_views_MSD_v1(img, tmpl)
    ...: %timeit skimage_views_MSD_v2(img, tmpl)
    ...: %timeit convolution_MSD(img, tmpl)
    ...: %timeit uniform_filter_MSD(img, tmpl)
    ...: %timeit opencv_generic(img, tmpl, 'SQDIFF')/tmpl.size
    ...: 
100 loops, best of 3: 8.49 ms per loop
100 loops, best of 3: 3.87 ms per loop
100 loops, best of 3: 5.96 ms per loop
100 loops, best of 3: 3.25 ms per loop
10000 loops, best of 3: 201 µs per loop

For bigger datasizes, depending on the system RAM available, we might have to fall back to methods other than views method that leaves noticeable memory footprint.

Let's test out on bigger datasizes with the remaining approaches -

In [97]: # Inputs
    ...: img = np.random.randint(0,255,(500,1000))
    ...: tmpl = np.random.randint(0,255,(15,15))
    ...: 

In [98]: %timeit convolution_MSD(img, tmpl)
    ...: %timeit uniform_filter_MSD(img, tmpl)
    ...: %timeit opencv_generic(img, tmpl, 'SQDIFF')/tmpl.size
    ...: 
1 loops, best of 3: 910 ms per loop
1 loops, best of 3: 483 ms per loop
100 loops, best of 3: 16.1 ms per loop

Summary

  • When working with known similarity measures , i.e. one of the six methods listed with OpenCV based template matching method and if we have access to OpenCV, that would be the best one.

  • Without OpenCV, for a special case like mean squared deviation, we could make use of convolution to have decently efficient approaches.

  • For generic/custom functions and with decent sized datasizes, we could look into 4D views to have efficient vectorized solutions. For large datasizes, we might want to use one loop and use 3D views instead of 4D with the intention of reducing memory footprint. For really large ones, you might have to fall back to two nested loops.

like image 164
Divakar Avatar answered Nov 14 '22 23:11

Divakar