Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Basic pattern recognition in binary (pixelated) image

Here is a cropped example (about 11x9 pixels) of the kind of images (which ultimately are actually all of size 28x28, but stored in memory flattened as a 784-components array) I will be trying to apply the algorithm on:

sample image

Basically, I want to be able to recognize when this shape appears (red lines are used to put emphasis on the separation of the pixels, while the surrounding black border is used to better outline the image against the white background of StackOverflow):

sample shape

The orientation of it doesn't matter: it must be detected in any of its possible representations (rotations and symmetries) along the horizontal and vertical axis (so, for example, a 45° rotation shouldn't be considered, nor a diagonal symmetry: only consider 90°, 180°, and 270° rotations, for example).

There are two solutions to be found on that image that I first presented, though only one needs to be found (ignore the gray blurr surrounding the white region):

first example highlighted

Take this other sample (which also demonstrates that the white figures inside the images aren't always fully surrounded by black pixels):

example 2

The function should return True because the shape is present:

highlighting the shape

Now, there is obviously a simple solution to this:

Use a variable such as pattern = [[1,0,0,0],[1,1,1,1]], produce its variations, and then slide all of the variations along the image until an exact match is found at which point the whole thing just stops and returns True.

This would, however, in the worst case scenario, take up to 8*(28-2)*(28-4)*(2*4) which is approximately 40000 operations for a single image, which seem a bit overkill (if I did my quick calculations right).

I'm guessing one way of making this naive approach better would be to first of all scan the image until I find the very first white pixel, and then start looking for the pattern 4 rows and 4 columns earlier than that point, but even that doesn't seem good enough.

Any ideas? Maybe this kind of function has already been implemented in some library? I'm looking for an implementation or an algorithm that beats my naive approach.

As a side note, while kind of a hack, I'm guessing this is the kind of problem that can be offloaded to the GPU but I do not have much experience with that. While it wouldn't be what I'm looking for primarily, if you provide an answer, feel free to add a GPU-related note.


EDIT:

I ended up making an implementation of the accepted answer. You can see my code in this Gist.

like image 279
payne Avatar asked Nov 23 '18 04:11

payne


2 Answers

If you have too many operations, think how to do less of them.

For this problem I'd use image integrals.

If you convolve a summing kernel over the image (this is a very fast operation in fft domain with just conv2,imfilter), you know that only locations where the integral is equal to 5 (in your case) are possible pattern matching places. Checking those (even for your 4 rotations) should be computationally very fast. There can not be more than 50 locations in your example image that fit this pattern.

My python is not too fluent, but this is the proof of concept for your first image in MATLAB, I am sure that translating this code should not be a problem.

% get the same image you have (imgur upscaled it and made it RGB)
I=rgb2gray(imread('https://i.stack.imgur.com/l3u4A.png'));
I=imresize(I,[9 11]);
I=double(I>50);

% Integral filter definition (with your desired size)
h=ones(3,4);

% horizontal and vertical filter (because your filter is  not square)
Ifiltv=imfilter(I,h);
Ifilth=imfilter(I,h');
% find the locations where integral is exactly the value you want
[xh,yh]=find(Ifilth==5);
[xv,yv]=find(Ifiltv==5);

% this is just plotting, for completeness
figure()
imshow(I,[]);
hold on
plot(yh,xh,'r.');
plot(yv,xv,'r.');

This results in 14 locations to check. My standard computer takes 230ns on average on computing both image integrals, which I would call fast.

enter image description here

Also GPU computing is not a hack :D. Its the way to go with a big bunch of problems because of the enormous computing power they have. E.g. convolutions in GPUs are incredibly fast.

like image 110
Ander Biguri Avatar answered Nov 04 '22 18:11

Ander Biguri


The operation you are implementing is an operator in Mathematical Morphology called hit and miss.

It can be implemented very efficiently as a composition of two erosions. If the shape you’re detecting can be decomposed into a few simple geometrical shapes (especially rectangles are quick to compute) then the operator can be even more efficient.

You’ll find very efficient erosions in most image processing libraries, for example try OpenCV. OpenCV also has a hit and miss operator, here is a tutorial for how to use it.


As an example for what output to expect, I generated a simple test image (left), applied a hit and miss operator with a template that matches at exactly one place in the image (middle), and again with a template that does not match anywhere (right):

Image showing the three images as described above

I did this in MATLAB, not Python, because I have it open and it's easiest for me to use. This is the code:

se = [1,1,1,1      % Defines the template
      0,0,0,1];
img = [0,0,0,0,0,0 % Defines the test image
       0,1,1,1,1,0
       0,0,0,0,1,0
       0,0,0,0,0,0
       0,0,0,0,0,0
       0,0,0,0,0,0];
img = dip_image(img,'bin');

res1 = hitmiss(img,se);
res2 = hitmiss(img,rot90(se,2));

% Quick-and-dirty display
h = dipshow([img,res1,res2]);
diptruesize(h,'tight',3000)
hold on
plot([5.5,5.5],[-0.5,5.5],'r-')
plot([11.5,11.5],[-0.5,5.5],'r-')

The code above uses the hit and miss operator as I implemented in DIPimage. This same implementation is available in DIPlib's Python bindings as dip.HitAndMiss() (install with pip install diplib):

import diplib as dip
# ...
res = dip.HitAndMiss(img, se)
like image 36
Cris Luengo Avatar answered Nov 04 '22 16:11

Cris Luengo