 

improving code efficiency: standard deviation on sliding windows


I am trying to improve a function that computes, for each pixel of an image, the standard deviation of the pixels located in its neighborhood. My function uses two nested loops to run across the matrix, and it is the bottleneck of my program. I guess there is likely a way to improve it by getting rid of the loops thanks to numpy, but I don't know how to proceed. Any advice is welcome!

Regards

    import numpy as np

    def sliding_std_dev(image_original, radius=5):
        height, width = image_original.shape
        result = np.zeros_like(image_original)  # initialize the output matrix
        hgt = range(radius, height - radius)
        wdt = range(radius, width - radius)
        for i in hgt:
            for j in wdt:
                # std of the (2*radius x 2*radius) neighborhood of pixel (i, j)
                result[i, j] = np.std(image_original[i-radius:i+radius, j-radius:j+radius])
        return result
asked Aug 24 '13 by baptiste pagneux

2 Answers

Cool trick: you can compute the standard deviation given just the sum of squared values and the sum of values in the window.

Therefore, you can compute the standard deviation very fast using a uniform filter on the data:

    from scipy.ndimage import uniform_filter  # scipy.ndimage.filters is deprecated

    def window_stdev(arr, radius):
        # note: arr should be floating point to avoid integer truncation
        # windowed means of the values and of the squared values
        c1 = uniform_filter(arr, radius*2, mode='constant', origin=-radius)
        c2 = uniform_filter(arr*arr, radius*2, mode='constant', origin=-radius)
        # std = sqrt(E[X^2] - E[X]^2); drop the edge contaminated by the zero padding
        return ((c2 - c1*c1)**.5)[:-radius*2+1, :-radius*2+1]

This is ridiculously faster than the original function. For a 1024x1024 array and a radius of 20, the old function takes 34.11 seconds and the new function takes 0.11 seconds, a roughly 300-fold speed-up.
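If you want to reproduce the comparison on your own machine, a minimal timing sketch along these lines should do (assuming both functions above are defined; absolute numbers will of course differ):

    import time
    import numpy as np

    img = np.random.rand(1024, 1024)

    t0 = time.perf_counter()
    sliding_std_dev(img, radius=20)   # the original double-loop version
    t1 = time.perf_counter()
    window_stdev(img, 20)             # the uniform_filter version
    t2 = time.perf_counter()

    print(f"loops:          {t1 - t0:.2f} s")
    print(f"uniform_filter: {t2 - t1:.2f} s")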


How does this work mathematically? It computes the quantity sqrt(mean(x^2) - mean(x)^2) for each window. We can derive this quantity from the standard deviation sqrt(mean((x - mean(x))^2)) as follows:

Let E be the expectation operator (basically mean()), and X be the random variable of data. Then:

E[(X - E[X])^2]
= E[X^2 - 2X*E[X] + E[X]^2]
= E[X^2] - E[2X*E[X]] + E[E[X]^2] (by the linearity of the expectation operator)
= E[X^2] - 2E[X]*E[X] + E[X]^2 (again by linearity, and the fact that E[X] is a constant)
= E[X^2] - E[X]^2

which proves that the quantity under the square root is exactly the variance, so the technique computes the same standard deviation as the original code.
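A quick numerical check of the identity (my addition; any array will do):

    import numpy as np

    x = np.random.rand(1000)
    lhs = np.sqrt(np.mean((x - x.mean())**2))  # definition of the standard deviation
    rhs = np.sqrt(np.mean(x*x) - x.mean()**2)  # the identity derived above
    print(np.allclose(lhs, rhs), np.allclose(rhs, np.std(x)))  # True True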

answered Oct 02 '22 by nneonneo

The method most often used for this kind of thing in image processing is the summed area table, an idea introduced by Crow in a 1984 paper. The idea is that, when you compute a quantity by adding over a window and then move the window, e.g. one pixel to the right, you don't need to add up all the items in the new window again: you only need to subtract the leftmost column from the total and add the new rightmost column. So if you create an accumulated sum array over both dimensions of your array, you can get the sum over any window with a couple of additions and subtractions. If you keep summed area tables for your array and its square, it's very easy to get the variance from those two. Here's an implementation:

    import numpy as np

    def windowed_sum(a, win):
        # 2-D summed area table: inclusive cumulative sums over both axes,
        # so table[i, j] holds the sum of a[:i+1, :j+1]
        table = np.cumsum(np.cumsum(a, axis=0), axis=1)
        win_sum = np.empty(tuple(np.subtract(a.shape, win - 1)))
        # four cases: the top-left window needs one lookup, windows along the
        # first row/column need two, and the general case needs four
        # (inclusion-exclusion on the table corners)
        win_sum[0, 0] = table[win-1, win-1]
        win_sum[0, 1:] = table[win-1, win:] - table[win-1, :-win]
        win_sum[1:, 0] = table[win:, win-1] - table[:-win, win-1]
        win_sum[1:, 1:] = (table[win:, win:] + table[:-win, :-win] -
                           table[win:, :-win] - table[:-win, win:])
        return win_sum

    def windowed_var(a, win):
        # var = E[X^2] - E[X]^2, with means taken over win*win pixels
        win_a = windowed_sum(a, win)
        win_a2 = windowed_sum(a * a, win)
        return (win_a2 - win_a * win_a / win / win) / win / win

To see that this works:

    >>> a = np.arange(25).reshape(5, 5)
    >>> windowed_var(a, 3)
    array([[ 17.33333333,  17.33333333,  17.33333333],
           [ 17.33333333,  17.33333333,  17.33333333],
           [ 17.33333333,  17.33333333,  17.33333333]])
    >>> np.var(a[:3, :3])
    17.333333333333332
    >>> np.var(a[-3:, -3:])
    17.333333333333332
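Since the question asks for the standard deviation rather than the variance, a thin wrapper (my addition, not part of the original answer) is all that's needed:

    def windowed_std(a, win):
        # the standard deviation is the square root of the variance
        return np.sqrt(windowed_var(a, win))

    # sanity check against numpy on the top-left 3x3 window
    a = np.arange(25, dtype=float).reshape(5, 5)
    print(np.isclose(windowed_std(a, 3)[0, 0], np.std(a[:3, :3])))  # True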

This should run a couple of notches faster than convolution-based methods.
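A rough, machine-dependent way to check that claim is to time this against the window_stdev function from the other answer (window sizes chosen to match; exact numbers will vary):

    import time
    import numpy as np

    img = np.random.rand(1024, 1024)

    t0 = time.perf_counter()
    windowed_var(img, 40)   # summed area tables, 40x40 window
    t1 = time.perf_counter()
    window_stdev(img, 20)   # uniform filter, also a 40x40 window
    t2 = time.perf_counter()

    print(f"summed area table: {t1 - t0:.3f} s")
    print(f"uniform filter:    {t2 - t1:.3f} s")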

answered Oct 02 '22 by Jaime